Hi, I'm Evan, developer of Modern CSV. I started developing this out of frustration with how a certain spreadsheet program handles CSV files. I took inspiration from the text editors we've all come to love and this is the result. It's available for Windows, Mac, and Linux. Here are some of the features that make it stand out:
- Multiple cell/row/column editing
- Easy navigation between files
- Keyboard shortcut customization
- Command palette
- Read-only mode for super large files
With version 2, I'm adding some basic data analysis tools, some new themes, M1 compatibility for Mac users, and a whole bunch of editing commands, most of which are user-requested (well, technically all since I use my own product). The current beta version will work until June 25. It includes all Premium features without the need for a license. I'll be happy to hear any feedback you have about it!
I'm looking forward to trying this, learned the hard way that Excel will make unwanted changes without notice to CSV files. Like SSNs that start with zero having there leading zeros silently removed, breaking things.
I'm convinced that 90% of the people who rail against CSV for "not being a real format" are the ones that got burned by Excel. It has always shocked me how terrible Excel is at handling CSV given the sheer number of people who have this use case.
The crazy part is that it wouldn't be all that hard to handle it properly. Excel could examine each column and apply a uniform transform on each column instead of applying transforms on a cell by cell basis. They could even put in real effort and let the user choose the format for each column as part of the import process. You know, like being able to specify "text" for columns like SSNs or Credit cards that you aren't going to do math on anyway.
We've had bugs raised against our software because the clients use meeting titles like 1-2-1 for supervisory reviews which when extracted for reporting and opened in Excel (they love to dump data into Excel no matter what report functions you include directly in the application) get interpreted as a date even when output as a quoted string in the CSV file. Try explaining to a client that we have formatted it correctly and Excel is reading it wrong…
(We could of course output Excel files directly, but some of them can't download office documents from web apps because of security policy at their end.)
Given how badly common tools mangle unambiguously correct CSV data, how many variations there are which make “unambiguously correct CSV data” a somewhat small proportion of what is out there, and how many tools not only expect but require mis-formatted data and/or output it, it is scary how much the format is relied upon in major industries.
"Given how badly common tools mangle unambiguously correct CSV data, how many variations there are which make “unambiguously correct CSV data” a somewhat small proportion of what is out there, and how many tools not only expect but require mis-formatted data and/or output it, it is scary how much the format is relied upon in major industries."
In a nutshell, CSV isn't a format. It's a family of formats, and it's not even a well-specified family of formats.
At least in semi-technical circles, I've had some success in using this to push back against CSV suggestions and get them to use better things. I'm sure that in non-technical circles I'd have zero success with this, though. It sure ain't a magic talisman you can use.
JSON isn't exactly a rigidly specified format, but it's got a lot less flex in it and I've not had as much trouble with it. Biggest problem I have is just getting people using dynamic scripting languages to please output either a string or a number, but don't just output "whatever the scripting language happened to decide based on what code paths I happened to run" when you don't even realize your code ends up casting it back and forth without you knowing and what comes out is effectively random from my point of view.
> In a nutshell, CSV isn't a format. It's a family of formats, and it's not even a well-specified family of formats.
There is RFC4180. Though by 2005 when that came about there were already so many different cases around that it became just one of a great many possible variants.
I try not to push back too hard about CSV, for fear of “well, there is this XML format that is supported”! (bad enough in itself, but sometimes the “XML format” is even more poorly specified than the client's CSV edge cases which we are expected to guess).
JSON is nice as long, as you say, that strings are real strings and numbers are real numbers.
Oh, and dates/times are in an RFC3339 (or ISO8601) numeric (no localised month names, etc.) format either in UTC or with the timezone always specified, as strings (though at a pinch I'll accept a posix time_t for datetime if based on UTC). Not specifying how to handle dates/times/both is the major problem with JSON in my experience.
That used to be the dominant model for getting software exposed to as many users as possible, video games or not. It’s the classic shareware/trialware model. Although, with internet downloads, the “sharing” aspect isn’t really that important anymore.
Oh man, I just realized/remembered that shareware was meant to be shared from one person to the other, as in "I have this cool software, here it is on a floppy", and then you had to contact the developer for a license.
I tried this and it couldn't open a 6MB CSV file with 1920 columns and 1080 rows. Or maybe it could sometime if I decide that I can wait long enough. Notepad++ can open this file almost instantly.
I figured out it. It's due to the Auto-Fit Column Width setting. By default, it's enabled. Most files don't have a ton of columns so it's fine, but for 1920 columns, it can really slow things down. When I loaded the file with the setting enabled, it took about 40 seconds. When I tried again with the setting disabled, it took about 1 second.
For now, I may put a band-aid on it with a popup asking if you want to disable the feature when there are a lot of columns. I have some ideas on how to make it more efficient. I'll see what I can do before the full release.
Thanks. I changed the setting, but it didn't help much. It's smart to optimize for small files, since those are probably 99% of your use cases.
You might want to remove the part of your web site that says "View Large Files Quickly" because I was very excited when I read this and then very disappointed when it wasn't fast at all.
Notepad++ can open the file instantly and allows me to move around very quickly. You'd probably need something close that level of performance before you can claim that your product is fast for large files.
You have to change the setting in the Settings file (Edit Settings command) under the "User Value" column. Changing it under the "Default Value" column is a common mistake that's really my fault, so I intend to rectify it soon.
If it still doesn't perform like that for you, let me know.
It's handled much larger files than that, so if it's something you don't mind sharing, you can send the file at https://www.moderncsv.com/report-a-bug. It is a beta version, so I'm trying to fix all these issues now.
Edit: On second thought, it's probably grayscale intensity hex values of a 1920x1080 image. I can reproduce that myself. Feel free to send me your file if you want, but it might not be necessary.
I've been using ModernCSV for a while to handle readonly CSVs (mainly for sanity checking) and it's been fantastic.
Version 2 looks great, it's a hard balance to add features and not become bloated so will be interesting to see what it's like going forward.
(I love the product, not affiliated, just been using it every day since November 2020 as a happily paid user and recommend it to anyone who has the "pleasure" of working with CSVs)
I'm not sure I see a huge advantage that can be obtained via tight integration that you can't get by simply clicking "Save" and then ALT-Tab-ing to your Python script/notebook/whatever and re-running it, targeted at the file you just saved. If you're feeling really feisty, see about integrating a file watch program to do it as soon as you save. It's not like the editor is going to be using a pandas layout internally it can just pass to Python without marshaling.
At some point (maybe version 3), I'd like to integrate Python to allow users to make their own commands. This would give them access to entire rows, columns, ranges, etc.
This app almost in its entirety has been a growing note in my idea vault for years now due to identical frustrations. Thank you! Very excited to try it.
Just for fun (and because I don't like unsourced claims) I fired up Mathematica to sum up the population of the countries listed. Here are the results:
It's not accurate because I'm not taking into account things such as "only in French Canada" or "only in currency", so some countries are double counted. But it gives a rough estimate. Five billion people for the dot, two for the comma.
This new version has a Convert Number Format command that can be used to convert between the two formats. It just occurred to me I didn't take that into account for the new statistics operations. I'll be sure to fix it.
First I've heard of this tool. I've been using the Python-based csvkit and the Java-based OpenRefine (formerly Google Refine) to work with and clean CSVs for years. Thanks for the recommendation.
This is really close to what I've been wanting to build for a while. A sort of excel for programmers with the added ability to write real lua or python code in each cell instead of "formulas".
BTW, I'm suprised this is written in Qt. Looks very modern Windows, in a good way of course :^)
Excel formulas are worst abominations of semi-programming. My coworker took advanced excel courses and I sometimes just can’t help her with these monstrosities because everything around them sucks if it’s something more than a simple max-3-term expression. If it was just python or lua, I’d cast few spells and she’d understand and use them much more easily.
I've been meaning to check out sc-im ("Spreadsheet Calculator Improvised" https://github.com/andmarti1424/sc-im) for a while now. Among its listed features:
> GNUPlot interaction.
> Scripting support with LUA. Also with triggers and c dynamic linked modules.
> Implement external functions in the language you prefer and use them in SC-IM.
Does anyone here have experience using it? Is it stable and reliable, and does it handle large files well?
You and I think alike because that was my initial thought. The practical considerations make it tricky. I may still do it, but I first plan to embed Python to allow users to implement their own commands.
For Emacs users there is a nice csv-mode on ELPA. It has aligned display, field based navigation and some basic sorting and manipulation options. And you have access to all the usual Emacs goodies of course.
Another option is to import into an org-mode table and then re-export as csv when needed.
Has anybody figured out a convenient way to post tabular data in a Slack message that doesn't look horrible? I usually use an "ASCII Table" tool to format the CSV and then post the results in a code block, but it's a PITA.
This is nicely done. Finding a reasonable csv editor was a 'white whale' I searched for for some years. I found that with EmEditor (commercial, I'm not associated with them), the killer feature is swapping between normal text editor and csv cell based editing seamlessly. I still tell anyone I see dealing with csv's about it. I wish the macros weren't in their own language, but performance on large files is crazy fast.
Looks great! It took me a few tries to understand how the Settings work and I would explain it in the help if I were you. I was trying to change the theme from dark to light and took me a while to understand that I had to add the new value on the righmost cell instead of changing the default value column :)
In my humble opinion the light theme would be better as a default because I am used to Excel. I suppose most Excel users would say the same.
That seems to be a common issue. I put "!!! Make changes below !!!" above the User Value column, but if you scroll straight down to the Theme row, you may miss that. Perhaps I could swap the User Value and Default Value column or pop up a message if the user edits the Default Value column.
That was part of the idea. Something you might like is you can navigate with alt+i/j/k/l instead of the arrow keys. That fraction of a second you won't have to move your wrist to the arrow keys and back adds up.
Looking at their website, I have in mind some of the things they do, but some of my ideas might have to go into a separate product. I'm careful not to go too far beyond what can fairly be called a CSV editor.
Yes, it's configurable. It has an auto-detect feature, but if it guesses wrong, you can reload the file with a different delimiter. You can also set a different delimiter to save. In this version, you can change it for an individual file with the File Metadata feature. Also, one of the preference files is called File Extension Options. That may help too. If you need help getting it to work like you want, let me know.
I do. I have phone numbers and more extensive number transformations on my to-do list. Those probably won't make it in there for version 2.0, but maybe 2.1 or 2.2.
An opposing viewpoint: please don't add any of these suggested features. I prefer the programs that manipulate CSV to only express the actual contents of the file. Formulas are the antithesis of that. The beauty, and utility, of CSV is its simplicity. All of the requested functionality can be easily performed in SQL, Excel, Pandas, R, Krangl, etc. Please keep formulas out of a CSV editor.
I agree with that purity argument, but I wonder if it really has to preclude all spreadsheet calculator ambitions: just take the "multiple files project" approach and make it good. One csv for raw data, one csv for configuration/formulas, one csv for intermediates/results.
Make full use of whatever awesome multifile read/edit the software presumably has already and then maybe go a little beyond. The "configuration" csv might have some hashbang equivalent defining a line offset for the header that isn't mapped into input/output files (perhaps a concept of "negative line numbers" for project properties?), contains references to those peer files and maybe activates project options like "display peer file contents in cells that are otherwise empty" if you want your working surface to remain "2d".
The argument to have this add-on (I'm not sure that it should be part of the product or a separate extension) would be that it would be a shame to learn all the UI ergonomics of the plain reader/editor and then not be able to leverage them for some calculations as well.
I am not talking about per cell formulas like in Excel. I am talking about a basic expression language that operates on the entire column, more like a database. Modern CSV already has some basic data transformation facilities, I'm only suggesting a few more in the same vein that make it a more potent data editing tool.
Possibly the most efficient and effective method here would be to integrate Lua into the internal data structure used for cells, and let replwoacause do whatever they like with a scripting language. I can definitely see how that would be useful to people.
I would actually be wary of trying to make this too much like a spreadsheet, even in appearance. You'll be pulled into that strange attractor, but you can't hope to compete in that space. Whatever oxygen isn't consumed by Excel itself is long gone between LibreOffice and online spreadsheets.
Happy paying Modern CSV user here and I say Yes please to item 4.
I recently was given a pile of hand-edited CSV files (a one off data transfer between two systems). The original export was missing one column, so being able to merge that in would have been very useful.
I'm sure with Pandas if you know the magic incantation it's easy to do (join two files on the ID column, and add column X from file 2 to file 1) but I wrote my own script for these files.
In others I've resorted to opening the CSV in Excel, doing an XLOOKUP and then fixing up the mess Excel created in Modern CSV afterwards.
(To the sibling commenter saying you can do it with SQL: yes and I'd love to. But how do you pipeline that with multiple CSVs that may have slightly different column names? If done manually it seems quicker to do it my way)
Thanks, I'll look for that next time. The problem with Excel and these CSV files was it mangled the date column, which I only found out about after a bunch had been fully edited. I used modern CSV to rescue this and so have been suspicous of going back to Excel to do any further manipulations.
The good news is PowerQuery has data types! Although it is not as flexible in editing particular records if that is a requirement (just bulk edits, joins and merges)
Isn't the whole point to have a lightweight system for viewing and basic manipulations of csv files, so that you could safely import them into more powerful systems that let you do complex transformations, visualizations,... ?
At this point why wouldn't I just go straight into one of the more powerful systems then? They all have sufficient viewing capabilities as well. The idea behind adding more utility functions to transform data is to make Modern CSV a nimble but capable CSV tool without the need for a more robust solution (like Power BI, Tableau, etc.)
Because csv import into Excel can be a bit of a nightmare if the data is formatted in a way not covered by the import wizards. I've written Python scripts before to format data so that Excel would parse it correctly (which tbf was mostly the fault of the party producing said files), I wouldn't have minded having a light-weight visual tool for that.
I just looked into it. On desktop, there's a tab after the user hits the accept button for managing consent, but the plugin removes it in mobile for UX reasons.
neat. python or c++? also I tried to enter in my email into where it says "Get useful tips and updates in your inbox." and it is stuck saying "processing"
It can handle about 2 billion rows and columns. Aside from that, it's limited by your machine's memory. If you open a file in edit mode, the memory footprint is about 3-4x the file size. If you open in read-only mode, it's only a fraction of the file size (maybe 20-25% or so).
Hmm, why did you build it as a desktop app in 2022? Seems like a no brainer to have this in the cloud so that you can seamlessly process huge files and integrate with various third-party APIs. I've used https://www.gigasheet.com/ for this before (which is cloud-based). And there is also the open source visidata project.
Definitely not a no-brainer. Don't want my CSVs integrating with APIs (or in the cloud at all), and I'm confident the desktop version handles large files better than a cloud solution would, as well as being far more responsive, quick, compact than any web-based solution.
I have zero interest in using business data with some online service that nobody else at my company has vetted. I vastly prefer being able to try and use a desktop application where I can be sure sensitive data is where I have 100% control over it.
I'm absolutely positive that you can make a desktop app a lot more snappy and usable than any browser-based solution. Especially if the point is to edit files on your computer – no way I'd want to upload a 100MB CSV to cloud before even having a sneak peek, but a (good) desktop app would allow a quick sneak peek of even a 10GB CSV.
Not everyone wants their solution hosted in the cloud. I would much rather have this as a desktop app that I can own than yet another SaaS product that I have to rent...
- Multiple cell/row/column editing - Easy navigation between files - Keyboard shortcut customization - Command palette - Read-only mode for super large files
With version 2, I'm adding some basic data analysis tools, some new themes, M1 compatibility for Mac users, and a whole bunch of editing commands, most of which are user-requested (well, technically all since I use my own product). The current beta version will work until June 25. It includes all Premium features without the need for a license. I'll be happy to hear any feedback you have about it!