EADST

Train XGBoost Model with Pandas Input

import warnings
warnings.filterwarnings("ignore")
import pandas as pd
import numpy as np
import xgboost as xgb
from sklearn.metrics import classification_report

train=pd.read_csv('./train.csv')
test=pd.read_csv('./test.csv')


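info = pd.read_csv('info.csv')
print(info.head())  # preview the first rows and the column names
print(info.shape)   # (rows, columns)
new_info = info.drop_duplicates(subset=['id'])  # keep only the first row for each id
# Left-join 'number' onto the training rows by id; every missing value,
# including 'number' for ids with no match, is filled with 0.
train2 = pd.merge(train, new_info[['id', 'number']], how='left', on='id').fillna(0)
# Apply the same join to the test rows so that train and test have the same
# feature columns (this assumes test.csv does not already contain 'number').
test = pd.merge(test, new_info[['id', 'number']], how='left', on='id').fillna(0)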
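The deduplicate-then-merge step can be checked on a toy table before running it on the real files. The sketch below is only an illustration: `attach_number` and the two small frames are hypothetical, and only the `id` and `number` column names come from the script. It passes `validate='many_to_one'` so that pandas raises an error if the right-hand table still has duplicate ids, and it fills only the `number` column rather than the whole frame.

```python
import pandas as pd


def attach_number(frame, info):
    """Left-join the 'number' column from info onto frame by 'id' (hypothetical helper)."""
    # drop_duplicates keeps the first row for each id by default.
    unique_info = info.drop_duplicates(subset=['id'])
    merged = pd.merge(
        frame,
        unique_info[['id', 'number']],
        how='left',
        on='id',
        validate='many_to_one',  # fail loudly if ids on the right are not unique
    )
    # Fill only the joined column, leaving gaps in other columns untouched.
    merged['number'] = merged['number'].fillna(0)
    return merged


# Toy data: id 1 appears twice in info, id 3 does not appear at all.
frame = pd.DataFrame({'id': [1, 2, 3], 'result': [0, 1, 0]})
info = pd.DataFrame({'id': [1, 1, 2], 'number': [10, 99, 20]})
merged = attach_number(frame, info)
print(merged)
```

Filling only `number` is a design choice for this sketch; calling `.fillna(0)` on the whole merged frame, as the script does, also replaces missing values in every other column.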

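# Separate the label from the features; 'uaid', 'result' and 'others' are not used as features.
train_y = train2['result']
train_x = train2.drop(columns=['uaid', 'result', 'others'])
test_id = test['id']
test_y = test['result']
test_x = test.drop(columns=['uaid', 'result', 'others'])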


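model = xgb.XGBClassifier()  # default hyperparameters
model.fit(train_x, train_y)
# This report is computed on the training data, so it measures fit rather than generalization.
train_predict_y = model.predict(train_x)
print(classification_report(train_y, train_predict_y))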


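# Predicted class probabilities for the test set, one column per class.
result = model.predict_proba(test_x)
# Put the id and the true label next to the probabilities.
result = pd.concat([test_id, test_y, pd.DataFrame(result)], axis=1)
result.to_csv('./test_result.csv')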
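A DataFrame built directly from the probability array gets integer column names (0, 1, ...), which can be hard to read in the saved CSV. The sketch below shows one way to label them. `build_result` and the demo values are hypothetical, and the hard-coded array stands in for the output of `predict_proba`. In the real script the class list could probably be taken from `model.classes_`, which scikit-learn style classifiers expose after fitting.

```python
import numpy as np
import pandas as pd


def build_result(ids, labels, proba, classes):
    """Combine ids, true labels and class probabilities into one frame (hypothetical helper)."""
    columns = [f'proba_{c}' for c in classes]
    proba_frame = pd.DataFrame(proba, columns=columns)
    # Reset the indexes so that concat lines the rows up by position.
    return pd.concat(
        [ids.reset_index(drop=True), labels.reset_index(drop=True), proba_frame],
        axis=1,
    )


# Hard-coded stand-ins for test_id, test_y and model.predict_proba(test_x).
demo_ids = pd.Series([101, 102, 103], name='id')
demo_labels = pd.Series([0, 1, 0], name='result')
demo_proba = np.array([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]])

demo_result = build_result(demo_ids, demo_labels, demo_proba, classes=[0, 1])
print(demo_result)
```

When saving such a frame, `to_csv(path, index=False)` omits the extra unnamed index column that the default call writes.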