I'm trying to save a DataFrame to an HDF5 store using the pandas built-in function to_hdf, but it raises the following exception:
    File "c:\python\lib\site-packages\pandas\io\pytables.py", line 3433, in create_axes
        raise e
    TypeError: Cannot serialize the column [date] because its data contents are [datetime] object dtype
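The DataFrame is built from a NumPy array that should have the correct type for each column.
I tried convert_objects(), which I had read about elsewhere, but it still fails.
Here is my test code. I must be missing something in the data conversion, but I can't figure out what.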
    import numpy as np
    import pandas as pd
    from datetime import datetime, timedelta

    columns = ['date', 'c1', 'c2']

    # building a sample test numpy array with datetime, float and integer
    dtype = np.dtype("datetime64, f8, i2")
    np_data = np.empty((0, len(columns)), dtype=dtype)
    for i in range(1, 3):
        line = [datetime(2015, 1, 1, 12, i), i/2, i*1000]
        np_data = np.append(np_data, np.array([line]), axis=0)

    print('##### numpy array')
    print(np_data)

    # creating the DataFrame from the numpy array
    df = pd.DataFrame(np_data, columns=columns)

    # trying to force object conversion
    df.convert_objects()

    print('##### dataframe array')
    print(df)

    # the following fails!
    try:
        df.to_hdf('store.h5', 'data', append=True)
        print('worked')
    except Exception, e:
        print('##### error')
        print(e)
The code above produces the following output:
    ##### numpy array
    [[datetime.datetime(2015, 1, 1, 12, 1) 0 1000]
     [datetime.datetime(2015, 1, 1, 12, 2) 1 2000]]
    ##### dataframe array
                     date c1    c2
    0 2015-01-01 12:01:00  0  1000
    1 2015-01-01 12:02:00  1  2000
    ##### error
    Cannot serialize the column [date] because its data contents are [datetime] object dtype
Almost all pandas operations return new objects. Your .convert_objects() call converted the columns but discarded the output, so df itself still holds object-dtype columns.
    In [20]: df2 = df.convert_objects()

    In [21]: df.dtypes
    Out[21]:
    date    object
    c1      object
    c2      object
    dtype: object

    In [22]: df2.dtypes
    Out[22]:
    date    datetime64[ns]
    c1               int64
    c2               int64
    dtype: object
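convert_objects() has been removed from later pandas releases, so here is a minimal sketch of the same "keep the returned object" point using infer_objects() and pd.to_datetime() instead. These two calls are swapped in for illustration and are not what the original session used; the sample rows and the names dtypes_before and dtypes_after are assumptions made for this example.

```python
import pandas as pd
from datetime import datetime

# Object-dtype frame, similar in spirit to the one built in the question.
rows = [[datetime(2015, 1, 1, 12, 1), 0, 1000],
        [datetime(2015, 1, 1, 12, 2), 1, 2000]]
df = pd.DataFrame(rows, columns=['date', 'c1', 'c2']).astype(object)

# Calling a conversion without keeping the result leaves df unchanged.
df.infer_objects()
dtypes_before = df.dtypes.astype(str).tolist()

# Assigning the result to a name keeps the converted frame.
df2 = df.infer_objects()
# Explicit datetime conversion; a no-op if the column is already datetime64.
df2['date'] = pd.to_datetime(df2['date'])
dtypes_after = df2.dtypes.astype(str).tolist()

print(dtypes_before)
print(dtypes_after)
```

The first list should still show object for every column, while the second should show a datetime64 column followed by two integer columns. The exact datetime64 resolution can differ between pandas versions.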
With the converted frame, save/restore works:
    In [23]: df2.to_hdf('store.h5', 'data', append=True)

    In [25]: pd.read_hdf('store.h5','data')
    Out[25]:
                     date  c1    c2
    0 2015-01-01 12:01:00   0  1000
    1 2015-01-01 12:02:00   1  2000

    In [26]: pd.read_hdf('store.h5','data').dtypes
    Out[26]:
    date    datetime64[ns]
    c1               int64
    c2               int64
    dtype: object
Finally, it is more idiomatic to construct the DataFrame directly. Types are inferred on construction.
    In [32]: DataFrame({'data' : pd.date_range('20150101',periods=2,freq='s'),'c1' : [0,1], 'c2' : [1000,2000]},columns=['data','c1','c2']).dtypes
    Out[32]:
    data    datetime64[ns]
    c1               int64
    c2               int64
    dtype: object
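Applied to the data from the question, direct construction might look like the sketch below. It assumes Python 3 (so i / 2 is true division and c1 becomes a float column) and builds each column separately instead of going through a mixed-type NumPy array. The to_hdf call is left commented out because it additionally needs the PyTables package installed.

```python
import pandas as pd
from datetime import datetime

# Build each column on its own so pandas can infer a proper dtype per column.
dates = [datetime(2015, 1, 1, 12, i) for i in range(1, 3)]
df = pd.DataFrame(
    {'date': pd.to_datetime(dates),
     'c1': [i / 2 for i in range(1, 3)],
     'c2': [i * 1000 for i in range(1, 3)]},
    columns=['date', 'c1', 'c2'],
)

dtype_names = df.dtypes.astype(str).tolist()
print(dtype_names)

# With typed columns the frame can be appended to a store
# (requires the PyTables package):
# df.to_hdf('store.h5', key='data', format='table', append=True)
```

No column ends up as object dtype here, which is the condition the serialization error complains about.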