python - Error with datetime column while serialising a DataFrame into an HDF5 store


I'm trying to save a DataFrame to an HDF5 store using the pandas built-in function to_hdf, but it raises the following exception:

    File "c:\python\lib\site-packages\pandas\io\pytables.py", line 3433, in create_axes
        raise e
    TypeError: Cannot serialize the column [date] because its data contents are [datetime] object dtype

The DataFrame is built from a NumPy array with the correct types for each column.

I tried convert_objects(), as I read elsewhere, but it still fails.

Here is my test code. I must be missing something in the data conversion, but I can't figure out what.

    import numpy as np
    import pandas as pd
    from datetime import datetime, timedelta

    columns = ['date', 'c1', 'c2']

    # building a sample test numpy array with datetime, float and integer
    dtype = np.dtype("datetime64, f8, i2")
    np_data = np.empty((0, len(columns)), dtype=dtype)
    for i in range(1, 3):
        line = [datetime(2015, 1, 1, 12, i), i/2, i*1000]
        np_data = np.append(np_data, np.array([line]), axis=0)
    print('##### numpy array')
    print(np_data)

    # creating the DataFrame from the numpy array
    df = pd.DataFrame(np_data, columns=columns)
    # trying to force object conversion
    df.convert_objects()
    print('##### dataframe array')
    print(df)

    # the following fails!
    try:
        df.to_hdf('store.h5', 'data', append=True)
        print('worked')
    except Exception as e:
        print('##### error')
        print(e)

The code above produces the following output:

    ##### numpy array
    [[datetime.datetime(2015, 1, 1, 12, 1) 0 1000]
     [datetime.datetime(2015, 1, 1, 12, 2) 1 2000]]
    ##### dataframe array
                      date c1    c2
    0  2015-01-01 12:01:00  0  1000
    1  2015-01-01 12:02:00  1  2000
    ##### error
    Cannot serialize the column [date] because its data contents are [datetime] object dtype

Almost all pandas operations return new objects. Your .convert_objects() call discarded its output.

    In [20]: df2 = df.convert_objects()

    In [21]: df.dtypes
    Out[21]:
    date    object
    c1      object
    c2      object
    dtype: object

    In [22]: df2.dtypes
    Out[22]:
    date    datetime64[ns]
    c1               int64
    c2               int64
    dtype: object
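Since convert_objects() was later deprecated and removed from pandas, here is a minimal sketch of the same point using explicit conversions instead. It is an illustration, not the code from the question: the DataFrame is built with dtype=object on purpose so that it starts in the same all-object state, and pd.to_datetime plus astype stand in for convert_objects().

```python
from datetime import datetime

import numpy as np
import pandas as pd

# Mixed-type rows can only be held by NumPy as a generic object array.
rows = np.array(
    [[datetime(2015, 1, 1, 12, 1), 0, 1000],
     [datetime(2015, 1, 1, 12, 2), 1, 2000]],
    dtype=object,
)

# dtype=object is passed explicitly (an assumption of this sketch) so that
# every column starts out as object, as in the question.
df = pd.DataFrame(rows, columns=["date", "c1", "c2"], dtype=object)

# Calling a conversion without keeping the result leaves df untouched.
df.astype({"c1": "int64", "c2": "int64"})
print(df.dtypes)

# Keeping the returned object is what actually gives you converted dtypes.
df2 = df.assign(date=pd.to_datetime(df["date"])).astype(
    {"c1": "int64", "c2": "int64"}
)
print(df2.dtypes)
```

The first print still shows object for every column; only df2, which holds the returned result, carries the converted types.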

Save/restore:

    In [23]: df2.to_hdf('store.h5', 'data', append=True)

    In [25]: pd.read_hdf('store.h5','data')
    Out[25]:
                     date  c1    c2
    0 2015-01-01 12:01:00   0  1000
    1 2015-01-01 12:02:00   1  2000

    In [26]: pd.read_hdf('store.h5','data').dtypes
    Out[26]:
    date    datetime64[ns]
    c1               int64
    c2               int64
    dtype: object
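If you want to check the round trip outside an interactive session, something along these lines should work. It is only a sketch: roundtrip_hdf is a hypothetical helper name, the file goes into a temporary directory rather than store.h5, and it assumes the optional PyTables package is installed (the ImportError branch covers the case where it is not).

```python
import os
import tempfile

import pandas as pd


def roundtrip_hdf(frame, key="data"):
    """Write frame to a temporary HDF5 file and read it back (hypothetical helper)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "store.h5")
        # format="table" is the appendable on-disk layout used with append=True.
        frame.to_hdf(path, key=key, format="table", append=True)
        return pd.read_hdf(path, key=key)


# A frame whose date column is already a proper datetime dtype.
df2 = pd.DataFrame(
    {
        "date": pd.to_datetime(["2015-01-01 12:01:00", "2015-01-01 12:02:00"]),
        "c1": [0, 1],
        "c2": [1000, 2000],
    }
)

try:
    restored = roundtrip_hdf(df2)
    print(restored.dtypes)
except ImportError as exc:
    # PyTables is an optional dependency of pandas.
    print("HDF5 support unavailable:", exc)
```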

Finally, it is more idiomatic to construct the DataFrame directly. Types are inferred on construction.

    In [32]: DataFrame({'data' : pd.date_range('20150101',periods=2,freq='s'),'c1' : [0,1], 'c2' : [1000,2000]},columns=['data','c1','c2']).dtypes
    Out[32]:
    data    datetime64[ns]
    c1               int64
    c2               int64
    dtype: object
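Applied to the rows from the question, direct construction could look roughly like this. This is a sketch assuming Python 3, where i / 2 is true division, so c1 comes out as a float column rather than the integers shown in the question's output.

```python
from datetime import datetime

import pandas as pd

columns = ["date", "c1", "c2"]

# One tuple per row, skipping the intermediate NumPy array entirely;
# pandas then infers a dtype for each column on its own.
rows = [(datetime(2015, 1, 1, 12, i), i / 2, i * 1000) for i in range(1, 3)]
df = pd.DataFrame(rows, columns=columns)

print(df.dtypes)
```

Because no column ends up as object dtype here, no separate conversion step is needed before calling to_hdf.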
