
It's a crying shame we never settled on a text format separated by control characters. There are ASCII control characters for record and field (unit) separators. A bit of user space support for that would have been great.



As I recall, you can tell Awk to use the control characters as record and field separators. Not helpful if you're getting your data from others, but if you're working by yourself, you have the option. I've come to use control characters as a default because it makes life so much easier.
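For anyone who wants to try the same idea outside Awk, here is a minimal Python sketch. It assumes every field is a plain string and that no field ever contains the separator characters themselves; the helper names `dumps` and `loads` are made up for illustration and are not part of any library.

```python
# ASCII control characters reserved for delimiting data.
US = "\x1f"  # unit (field) separator, ASCII 31
RS = "\x1e"  # record separator, ASCII 30


def dumps(rows):
    """Join rows of string fields into one US/RS-delimited string.

    Hypothetical helper: it refuses fields that contain a separator,
    since this sketch defines no escaping scheme.
    """
    for row in rows:
        for field in row:
            if US in field or RS in field:
                raise ValueError("field contains a separator character")
    return RS.join(US.join(row) for row in rows)


def loads(text):
    """Split a US/RS-delimited string back into a list of rows."""
    if not text:
        return []
    return [record.split(US) for record in text.split(RS)]


# Commas, quotes, tabs and newlines need no quoting or escaping.
rows = [
    ["name", "quote"],
    ["Ada", 'She said "hi", then left'],
    ["Bob", "line one\nline two\twith a tab"],
]
encoded = dumps(rows)
assert loads(encoded) == rows
```

The design choice is that nothing is quoted or escaped, which only holds as long as the data never contains ASCII 30 or 31.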


What do you recommend for viewing and editing such files?


VisiData works with arbitrary separators. I just tried it with a CSV separated with ␟ (ASCII unit separator) and it worked just fine.


Excel too?


lolive, VisiData has some Excel support. However, don't expect VisiData to be a full-blown editor for Excel files. It can provide a view of the data in an Excel spreadsheet.


I think GP means to ask whether Excel can read files with ASCII 29, 30, and 31 separators.


If you have a Python installation available, openpyxl[1] is great both for converting to .csv and for packaging .csv outputs as .xlsx (which is really zipped .xml, anyway).

[1] https://openpyxl.readthedocs.io/en/stable/index.html


It is a shame. I have been using tab-separated sheets recently, as they allow me to simply not care about almost any possible character in my strings... apart from tabs, of course. But those are far less common than commas, and putting strings in quotes 100% of the time looks messy to me.


Far less common would be ASCII 30 (record separator) and ASCII 31 (unit separator). Add ASCII 29 (group separator) and you can cram multiple datasets into one file.
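A rough Python sketch of the multiple-datasets idea, assuming ASCII 29 separates datasets, ASCII 30 separates records, and ASCII 31 separates fields, and that none of the three ever appears inside the data. The function name `split_datasets` and the sample data are hypothetical.

```python
GS = "\x1d"  # group separator, ASCII 29 (between datasets)
RS = "\x1e"  # record separator, ASCII 30 (between rows)
US = "\x1f"  # unit separator, ASCII 31 (between fields)


def split_datasets(text):
    """Split one string into datasets, each a list of rows of fields."""
    return [
        [record.split(US) for record in group.split(RS)]
        for group in text.split(GS)
    ]


# Two unrelated tables packed into a single string.
people = "name" + US + "age" + RS + "Ada" + US + "36"
cities = "city" + US + "country" + RS + "Oslo" + US + "Norway"
combined = people + GS + cities

datasets = split_datasets(combined)
assert datasets[0] == [["name", "age"], ["Ada", "36"]]
assert datasets[1] == [["city", "country"], ["Oslo", "Norway"]]
```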


Some discussion of that here: https://news.ycombinator.com/item?id=31220841

To be really useful as a format it would just need for text editors to:

- display something distinct for the field separator (some editors do this)
- treat the record separator character like a carriage return (not aware of any editors that do this)
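Until editors do that, the two display rules can be approximated with a small filter. This is only a sketch, assuming the Unicode "control picture" symbols ␟ (U+241F) and ␞ (U+241E) are acceptable stand-ins; the name `render` is invented for illustration.

```python
US = "\x1f"  # unit (field) separator, ASCII 31
RS = "\x1e"  # record separator, ASCII 30


def render(text):
    """Return a view-only copy of the text.

    Each field separator becomes the visible symbol U+241F, and each
    record separator becomes U+241E followed by a line break, so every
    record appears on its own line.
    """
    return text.replace(US, "\u241f").replace(RS, "\u241e\n")


sample = "name" + US + "age" + RS + "Ada" + US + "36"
print(render(sample))
```

The output is meant for reading only. A field that itself contains a newline would still break across lines in this view.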


>To be really useful as a format it would just need for text editors to: -display something distinct for the field separator

Which would be trivial too.


The programming might be straightforward. Trying to persuade the product owners to do it is a different matter.


Pull request?


Yeah, that would work /s


After you. ;0)


> To be really useful as a format it would just need for text editors to

This made me think of WordPerfect's "reveal codes" functionality. :)

(Word's "Reveal Formatting" is supposedly similar.)


Right. The issue is the user space support at the end of the day.


Tab-delimited "csv" formats are quite common (e.g. the CoNLL format family for many natural language processing tasks) and have been supported by common tools such as MS Excel for decades.



Most important comment I have ever read on HN!



