Convert Data Frame To Another Data Frame, Split Compound String Cell Into Individual Rows
I am looking to convert data frame df1 to df2 using Python. I have a solution that uses loops, but I am wondering if there is an easier way to create df2. df1 Test1 Test2 20
Solution 1:
Here's one way using numpy.repeat and itertools.chain:
import numpy as np
import pandas as pd
from itertools import chain
# df refers to the input frame (df1 in the question)
# split by delimiter and calculate length for each row
split = df['Test2'].str.split(':')
lens = split.map(len)
# repeat non-split columns
cols = ('Test1', '2014', '2015', '2016', 'Present')
d1 = {col: np.repeat(df[col], lens) for col in cols}
# chain split columns
d2 = {'Test2': list(chain.from_iterable(split))}
# combine in a single dataframe
res = pd.DataFrame({**d1, **d2})
print(res)
   2014  2015  2016  Present Test1 Test2
1    90    85    84        0     x     a
2    88    79    72        1     x     a
2    88    79    72        1     x     b
3    75    76    81        0     y     a
3    75    76    81        0     y     b
3    75    76    81        0     y     c
4    60    62    66        0     y     b
5    68    62    66        1     y     c
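For reuse, the repeat-and-chain idea can be wrapped in a small function. This is only a sketch: the helper name split_rows and the sample frame are hypothetical (the sample is not the question's df1), and it assumes every cell in the split column is a string. Unlike the snippet above, it repeats plain NumPy arrays, so the result gets a fresh 0..n-1 index.

```python
from itertools import chain

import numpy as np
import pandas as pd


def split_rows(df, column, sep=":"):
    """Return a new frame with one row per delimited value in `column`.

    Hypothetical helper; assumes every cell in `column` is a string.
    """
    parts = df[column].str.split(sep)   # Series of lists
    lens = parts.map(len).to_numpy()    # number of pieces per row
    other = [c for c in df.columns if c != column]
    # Repeat each untouched column once per piece in the same row.
    data = {c: np.repeat(df[c].to_numpy(), lens) for c in other}
    # Flatten the lists of pieces into one long list.
    data[column] = list(chain.from_iterable(parts))
    # Restore the original column order.
    return pd.DataFrame(data)[list(df.columns)]


# Hypothetical sample data, not the question's actual df1.
sample = pd.DataFrame(
    {"Test1": ["x", "y"], "Test2": ["a:b", "c"], "Present": [1, 0]}
)
result = split_rows(sample, "Test2")
print(result)
```

Because the columns to repeat are derived from df.columns, the caller does not need to list them by hand as in the cols tuple above.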
Solution 2:
This will achieve what you want:
import pandas as pd
# Converting "Test2" strings into lists of values
df["Test2"] = df["Test2"].apply(lambda x: x.split(":"))
# Creating a Series of the individual "Test2" values, indexed by original row
test2 = df.apply(lambda x: pd.Series(x['Test2']), axis=1).stack().reset_index(level=1, drop=True)
test2.name = 'Test2'
# Joining both dataframes
df = df.drop('Test2', axis=1).join(test2)
print(df)
  Test1  2014  2015  2016  Present Test2
1     x    90    85    84        0     a
2     x    88    79    72        1     a
2     x    88    79    72        1     b
3     y    75    76    81        0     a
3     y    75    76    81        0     b
3     y    75    76    81        0     c
4     y    60    62    66        0     b
5     y    68    62    66        1     c
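On pandas 0.25 or later, the same split-to-rows result can probably be had more directly with DataFrame.explode, which neither answer above uses. The sketch below is a swapped-in alternative, not either author's method, and the sample frame is hypothetical rather than the question's df1.

```python
import pandas as pd

# Hypothetical sample data, not the question's actual df1.
sample = pd.DataFrame(
    {"Test1": ["x", "y"], "Test2": ["a:b", "c"], "Present": [1, 0]}
)

exploded = (
    sample
    # Turn each "a:b" string into a list ["a", "b"].
    .assign(Test2=sample["Test2"].str.split(":"))
    # Emit one row per list element, copying the other columns.
    .explode("Test2")
    # explode keeps the original (duplicated) index; renumber it.
    .reset_index(drop=True)
)
print(exploded)
```

Dropping the reset_index call keeps the duplicated row labels, which matches the index shown in the printed output above.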