Re: Untitled

From Ivory Duck, 11 months ago, written in Plain Text, viewed 188 times. This paste is a reply to Untitled from Little Finch.
URL: http://codebin.org/view/51071481
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

data = pd.read_csv('/datasets/flights.csv')

# One-hot encode the categorical columns; drop_first=True avoids the dummy trap
data_ohe = pd.get_dummies(data, drop_first=True)

# Split the data into features and target
features = data_ohe.drop(['Arrival Delay'], axis=1)
target = data_ohe['Arrival Delay']

features_train, features_valid, target_train, target_valid = train_test_split(
    features, target, test_size=0.25, random_state=12345)

# Work on explicit copies so the column assignments below
# do not raise SettingWithCopyWarning
features_train = features_train.copy()
features_valid = features_valid.copy()

numeric = ['Day', 'Day Of Week', 'Origin Airport Delay Rate',
           'Destination Airport Delay Rate', 'Scheduled Time', 'Distance',
           'Scheduled Departure Hour', 'Scheduled Departure Minute']

# Fit the scaler on the training set only, so that no information
# from the validation set leaks into the scaling parameters
scaler = StandardScaler()
scaler.fit(features_train[numeric])

# Transform the training set
features_train[numeric] = scaler.transform(features_train[numeric])

# Transform the validation set with the same fitted scaler
features_valid[numeric] = scaler.transform(features_valid[numeric])

print(features_train.shape)
print(features_valid.shape)
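
For readers unsure what drop_first=True changes in the paste above, here is a minimal sketch on a made-up frame. The toy column names and values are assumptions for illustration only and are not taken from flights.csv.

```python
import pandas as pd

# Hypothetical toy data; column names and values are illustrative only.
toy = pd.DataFrame({
    "Airline": ["AA", "DL", "UA", "AA"],
    "Distance": [300, 1200, 800, 450],
})

# Full one-hot encoding: one indicator column per category.
full = pd.get_dummies(toy)

# drop_first=True removes the first category's indicator column.
reduced = pd.get_dummies(toy, drop_first=True)

# In the full encoding the indicator columns of one feature always sum to 1
# in every row, so they are linearly dependent with a model's intercept.
dummy_cols = [c for c in full.columns if c.startswith("Airline_")]
row_sums = full[dummy_cols].sum(axis=1)

print(list(full.columns))
print(list(reduced.columns))
print(row_sums.tolist())
```

The dropped category is still recoverable: a row with zeros in every remaining indicator column belongs to it, so no information is lost.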
