Before we start

A classmate roped me into learning together, so let's study PyTorch properly. No better day than today.
* The code itself lives on Colab; here I only summarize the main / extended knowledge points.

Day 1 Structured Data Modeling Example

  • Using the Titanic dataset. The goal is to predict whether a passenger survived.
  • The dataset contains 10 features, among them:
    • 4 numerical features
    • 4 categorical features
    • 2 other features (ticket number & name)
  • Some of the features have missing values.
  • The tutorial then does the data preprocessing, builds an MLP with one hidden layer, and writes the training function.
  • For the optimizer, loss function, and evaluation metrics, please see the summary notes.
  • Pre-processing (a sketch follows this list):
    • Meaningless features are dropped directly
    • Categorical features are one-hot encoded
    • Missing values are turned into an auxiliary indicator feature
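Below is a minimal preprocessing sketch in pandas, assuming the raw DataFrame is named dfdata and the standard Titanic column names (Pclass, Sex, Age, Cabin); the helper name preprocessing and the fill value 0 are my own choices, not the tutorial's exact code:

import pandas as pd

def preprocessing(dfdata):
    dfresult = pd.DataFrame()

    # Pclass is categorical even though it looks numeric -> one-hot encode it
    dfPclass = pd.get_dummies(dfdata['Pclass'], prefix='Pclass', dtype='float32')
    dfresult = pd.concat([dfresult, dfPclass], axis=1)

    # Sex -> one-hot encode
    dfSex = pd.get_dummies(dfdata['Sex'], dtype='float32')
    dfresult = pd.concat([dfresult, dfSex], axis=1)

    # Age: keep the value, and add an indicator column marking missing entries
    dfresult['Age'] = dfdata['Age'].fillna(0)
    dfresult['Age_null'] = pd.isna(dfdata['Age']).astype('float32')

    # Cabin is mostly missing -> keep only the "is missing" indicator
    dfresult['Cabin_null'] = pd.isna(dfdata['Cabin']).astype('float32')

    # Ticket and Name are dropped as meaningless for this model
    return dfresult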

Some functions used in the program

pd.get_dummies(Series or DataFrame)
  • Converts a categorical feature to a one-hot encoding.
  • Parameters:
    • columns: when the input is a DataFrame, which columns to encode
    • prefix: string prepended to the generated column names
    • dummy_na = True: treat NaN as its own class, adding an extra column
    • drop_first = True: drop the first class (useful in linear models to avoid collinearity)
    • dtype: data type of the generated columns

pd.isna()  // alias of pd.isnull()
  • Detects missing values; returns a boolean object of the same shape as the input: True where a value is missing, False otherwise.
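A quick example of both calls; the toy column Embarked and its values are made up here just for illustration:

import numpy as np
import pandas as pd

df = pd.DataFrame({"Embarked": ["S", "C", np.nan, "Q"]})

# One-hot encode, keeping NaN as its own class and prefixing the new column names
onehot = pd.get_dummies(df, columns=["Embarked"], prefix="Embarked",
                        dummy_na=True, dtype="float32")
print(onehot.columns.tolist())   # ['Embarked_C', 'Embarked_Q', 'Embarked_S', 'Embarked_nan']

# Element-wise missing-value mask, same length as the input
print(pd.isna(df["Embarked"]).tolist())   # [False, False, True, False]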

Training code

  • It depends on personal coding style. Here is one skeleton:
    import sys
    from tqdm import tqdm

    def train(net, dl_train, epochs, loss_func, optimizer):
        for epoch in range(epochs):
            net.train()
            total_loss, step = 0, 0
            # tqdm wraps the iterable into an iterator with a progress bar,
            # i.e. it visualizes training progress
            loop = tqdm(enumerate(dl_train), total=len(dl_train), file=sys.stdout)
            for i, batch in loop:
                # forward pass
                x, y = batch
                preds = net(x)
                loss = loss_func(preds, y)

                # backward pass
                # The three lines below are always here.
                loss.backward()
                optimizer.step()
                optimizer.zero_grad()

                total_loss += loss.item()
                step += 1
                # Can add code for logging / tracking below
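  • A minimal setup sketch showing how this skeleton could be called. The input size 15, hidden size 20, batch size, and learning rate are assumptions here, and create_net is the same helper the save/load code below refers to:
    import torch
    from torch import nn
    from torch.utils.data import TensorDataset, DataLoader

    def create_net(n_features=15, n_hidden=20):
        # One-hidden-layer MLP; the output is a raw logit, paired with BCEWithLogitsLoss
        return nn.Sequential(
            nn.Linear(n_features, n_hidden),
            nn.ReLU(),
            nn.Linear(n_hidden, 1),
        )

    net = create_net()
    loss_func = nn.BCEWithLogitsLoss()
    optimizer = torch.optim.Adam(net.parameters(), lr=0.01)

    # x_train / y_train are the preprocessed numpy arrays; labels reshaped to (N, 1)
    dl_train = DataLoader(
        TensorDataset(torch.tensor(x_train).float(),
                      torch.tensor(y_train).float().reshape(-1, 1)),
        batch_size=8, shuffle=True)

    train(net, dl_train, epochs=30, loss_func=loss_func, optimizer=optimizer)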

Evaluation code

from copy import deepcopy

def evaluate(net, dl_val, loss_fn, metrics_dict, epoch, history):
    net.eval()
    total_loss, step = 0, 0
    loop = tqdm(enumerate(dl_val), total=len(dl_val), file=sys.stdout)

    # fresh copies of the stateful metric objects for this validation pass
    val_metrics_dict = deepcopy(metrics_dict)

    with torch.no_grad():
        for i, batch in loop:

            features, labels = batch

            # forward
            preds = net(features)
            loss = loss_fn(preds, labels)

            # metrics (per-batch values)
            step_metrics = {"val_" + name: metric_fn(preds, labels).item()
                            for name, metric_fn in val_metrics_dict.items()}
            step_log = dict({"val_loss": loss.item()}, **step_metrics)

            total_loss += loss.item()
            step += 1
            if i != len(dl_val) - 1:
                loop.set_postfix(**step_log)
            else:
                epoch_loss = total_loss / step
                epoch_metrics = {"val_" + name: metric_fn.compute().item()
                                 for name, metric_fn in val_metrics_dict.items()}
                epoch_log = dict({"val_loss": epoch_loss}, **epoch_metrics)
                loop.set_postfix(**epoch_log)

                for name, metric_fn in val_metrics_dict.items():
                    metric_fn.reset()

    epoch_log["epoch"] = epoch
    for name, metric in epoch_log.items():
        history[name] = history.get(name, []) + [metric]
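The metric_fn objects above are assumed to be stateful metrics in the torchmetrics style: calling one updates its state and returns the batch value, compute() aggregates over the whole pass, and reset() clears it. A sketch of the surrounding setup (the metric choice and names are mine):

from torchmetrics import Accuracy

metrics_dict = {"acc": Accuracy(task="binary")}
history = {}

# assumes net, dl_val and loss_func from the training section above
evaluate(net, dl_val, loss_fn=loss_func, metrics_dict=metrics_dict,
         epoch=1, history=history)
print(history)   # e.g. {'val_loss': [...], 'val_acc': [...], 'epoch': [1]}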

Save the model

# Save params only

torch.save(net.state_dict(), "./data/net_parameter.pt")

net_clone = create_net()
net_clone.load_state_dict(torch.load("./data/net_parameter.pt", weights_only=True))

torch.sigmoid(net_clone(torch.tensor(x_test[0:10]).float())).data
# Save the whole model (pickles the class too, so loading needs the class definition and weights_only=False)
torch.save(net, './data/net_model.pt')
net_loaded = torch.load('./data/net_model.pt', weights_only=False)
torch.sigmoid(net_loaded(torch.tensor(x_test[0:10]).float())).data