EADST

Train XGBoost Model with Pandas Input

import warnings
warnings.filterwarnings("ignore")
import pandas as pd
import numpy as np
import xgboost as xgb
from sklearn.metrics import classification_report

# Load the training and test sets
train = pd.read_csv('./train.csv')
test = pd.read_csv('./test.csv')


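# Load the auxiliary table and inspect its first rows and shape
info = pd.read_csv('info.csv')
print(info.head())
print(info.shape)
# Keep one row per id so the left merge below cannot duplicate training rows
new_info = info.drop_duplicates(subset=['id'])
# Left-join 'number' onto train by id; ids without a match get NaN, which is filled with 0
train2 = pd.merge(train, new_info[['id', 'number']], how='left', on='id').fillna(0)
# The model is fit on train2, so test needs the same 'number' column.
# Assumption: test.csv does not already contain it; the guard skips the merge if it does.
if 'number' not in test.columns:
    test = pd.merge(test, new_info[['id', 'number']], how='left', on='id').fillna(0)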

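# Split label and features; 'uaid', 'result' and 'others' are excluded from the features.
# Note: 'id' is not dropped here, so it is passed to the model as a feature.
train_y = train2['result']
train_x = train2.drop(columns=['uaid', 'result', 'others'])
test_id = test['id']
test_y = test['result']
test_x = test.drop(columns=['uaid', 'result', 'others'])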


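# Fit a classifier with default hyperparameters
model = xgb.XGBClassifier()
model.fit(train_x, train_y)
# This report is computed on the training data, so it measures fit, not generalization
train_predict_y = model.predict(train_x)
print(classification_report(train_y, train_predict_y))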


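# Predict class probabilities for the test set: one column per class, in model.classes_ order
proba = model.predict_proba(test_x)
proba = pd.DataFrame(proba, columns=[f'prob_{c}' for c in model.classes_], index=test_x.index)
# Save the id, the true label and the predicted probabilities side by side
result = pd.concat([test_id, test_y, proba], axis=1)
result.to_csv('./test_result.csv', index=False)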
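
The deduplicate-then-merge step in the script is worth checking on a small example, because a left merge against a table with repeated ids silently adds rows. The sketch below uses made-up toy frames that only borrow the column names from the script (id, result, number); the values are hypothetical. The validate argument is an optional pandas safeguard, not something the original script uses.

```python
import pandas as pd

# Toy data: column names mirror the script, values are invented for illustration.
train = pd.DataFrame({"id": [1, 2, 3], "result": [0, 1, 0]})
info = pd.DataFrame({"id": [1, 1, 2], "number": [10, 99, 20]})

# Without deduplication, id 1 matches two rows in info and its training row is duplicated.
naive = pd.merge(train, info[["id", "number"]], how="left", on="id")

# Keep the first row per id, then merge. validate="many_to_one" makes pandas
# raise an error if the right-hand table still has duplicate keys.
new_info = info.drop_duplicates(subset=["id"])
merged = pd.merge(
    train, new_info[["id", "number"]],
    how="left", on="id", validate="many_to_one",
)

# id 3 has no match in info, so its 'number' is NaN until it is filled.
merged = merged.fillna(0)

print(len(naive), len(merged))
print(merged)
```

Comparing len(train) with the length of the merged frame is a cheap check to run after any left merge that feeds a model.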