Python - read_table error while reading .idx file


I'm trying to read an .idx file that is 1.89 GB in size. If I write:

indexfile=pd.read_table(r"c:\edgar zip files\2001\company.idx")

I get this output:

   Company Name  Form Type  CIK  Date Filed  File Name
0  033 ASSET MANAGEMENT LLC / ...
1  033 ASSET MANAGEMENT LLC / ...
2  1 800 CONTACTS INC ...
3  1 800 CONTACTS INC ...
4  1 800 FLOWERS COM INC ...

where all the columns are merged into a single column.


If I do:

indexfile=pd.read_table(r"c:\edgar zip files\2001\company.idx",sep=" ")

I get this error:

CParserError: Error tokenizing data. C error: Expected 69 fields in line 4, saw 72


I can use:

indexfile=pd.read_table(r"c:\edgar zip files\2001\company.idx",error_bad_lines=False)

but that drops part of the data.

Is there a workaround?

PS: A sample .idx file is available from SEC EDGAR; download the company.idx file.

Your column entries have spaces in them, so splitting on a single space breaks one entry into several fields. Use two spaces as the separator instead.

indexfile=pd.read_table(r"c:\edgar zip files\2001\company.idx",sep="  ")
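
To illustrate why a wider separator helps, here is a minimal sketch that splits each line on runs of two or more spaces, so single spaces inside a company name stay intact. It uses only the standard library. The sample text, the `split_fields` helper, and every value other than the company names are made up for illustration, and it assumes no field contains two consecutive spaces.

```python
import re

# Hypothetical sample in the spirit of company.idx: fields are padded with
# runs of spaces, and company names contain single spaces.
# All values other than the company names are made-up placeholders.
SAMPLE = """\
Company Name                 Form Type   CIK         Date Filed  File Name
033 ASSET MANAGEMENT LLC /   FORM-A      0000000001  2001-01-01  path/a.txt
1 800 CONTACTS INC           FORM-B      0000000002  2001-01-02  path/b.txt
"""

# Two or more whitespace characters mark a gap between fields.
FIELD_GAP = re.compile(r"\s{2,}")


def split_fields(line):
    """Split one line on runs of 2+ spaces, keeping single spaces inside fields."""
    return FIELD_GAP.split(line.strip())


# Parse every non-empty line; the first parsed line is the column header.
rows = [split_fields(line) for line in SAMPLE.splitlines() if line.strip()]
header, records = rows[0], rows[1:]

for record in records:
    print(dict(zip(header, record)))
```

In pandas, the same pattern can be passed as sep=r"\s{2,}" together with engine="python", since regular-expression separators are handled by the Python parser. If the file turns out to be strictly fixed-width, pd.read_fwf may be a better fit. Any description lines before the column header would also need to be skipped, for example with skiprows.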
