Python - read_table error while reading .idx file
I'm trying to read an .idx file that is 1.89 GB in size. If I write:
indexfile=pd.read_table("c:\edgar zip files\2001\company.idx")
I get the output as:
- company name form type cik date filed file name
- 0 033 asset management llc / ...
- 1 033 asset management llc / ...
- 2 1 800 contacts inc ...
- 3 1 800 contacts inc ...
- 4 1 800 flowers com inc ...
where all the columns are merged into a single column.
If I do:
indexfile=pd.read_table("c:\edgar zip files\2001\company.idx",sep=" ")
I get the error:
CParserError: Error tokenizing data. C error: Expected 69 fields in line 4, saw 72
I can use:
indexfile=pd.read_table("c:\edgar zip files\2001\company.idx",error_bad_lines=False)
but that removes part of the data.
Is there a workaround?
PS: Link to a sample .idx file from SEC EDGAR. Download the company.idx file.
Your column entries have spaces in them, so a single space cannot be used as the separator. Use two spaces as the separator instead.
indexfile=pd.read_table("c:\edgar zip files\2001\company.idx",sep="  ",engine="python")
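If the columns are padded with a varying number of spaces, a literal two-space separator can still leave empty fields. One possible variation is a regular-expression separator that treats any run of two or more whitespace characters as one delimiter. The sketch below is not from the original answer: the sample rows are made up to imitate the layout shown in the question, and the file names in them are hypothetical.

```python
import io

import pandas as pd

# Made-up sample imitating the layout in the question; the form types, dates
# and file names are illustrative only, not real index entries.
sample = (
    "Company Name                Form Type   CIK      Date Filed  File Name\n"
    "033 ASSET MANAGEMENT LLC /  SC 13G      1099150  2001-02-14  edgar/data/a.txt\n"
    "1 800 CONTACTS INC          10-K        1050122  2001-03-30  edgar/data/b.txt\n"
)

# r"\s{2,}" matches two or more whitespace characters, so the single spaces
# inside "1 800 CONTACTS INC" or "SC 13G" are kept as part of the field.
# Regex separators require the slower Python parsing engine.
indexfile = pd.read_table(io.StringIO(sample), sep=r"\s{2,}", engine="python")

print(indexfile.columns.tolist())
print(indexfile.shape)
```

For the real file, replace io.StringIO(sample) with the path, written as a raw string such as r"c:\edgar zip files\2001\company.idx" so that \2001 is not read as an escape sequence. If the file has descriptive lines or a dashed rule before or after the column names, skiprows would probably be needed as well. pd.read_fwf is another option worth trying for fixed-width files like this one.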