Kaggle - Titanic(3)

dev.wookii 2020. 3. 2. 16:21

This time, let's do some Feature Engineering on the Titanic dataset~

4. Feature Engineering

Feature engineering is the process of using domain knowledge of the data to create features (feature vectors) that make machine learning algorithms work. 

A feature vector is an n-dimensional vector of numerical features that represents some object.
Many algorithms in machine learning require a numerical representation of objects, since such representations facilitate processing and statistical analysis.

 

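 >> Let's build the feature vectors that make machine learning work~

>> A feature vector is an n-dimensional vector that represents an object by its numerical features. We use it because a numerical representation makes processing and statistical analysis easier.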

 

4.1 How did the Titanic sink?

The ship sank from the bow, where the third-class rooms were located. In conclusion, Pclass is a key feature for the classifier.

4.2 Name

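The titles in each name are split into four groups (Mr, Miss, Mrs, Others), and a feature vector is built from them so the groups can be compared. To do this, every name is classified by its title.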
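The grouping described above could be sketched roughly as follows. This is a minimal example, not necessarily the exact code behind this post: the sample rows, the regular expression, the `TitleCode` column name, and the numeric codes (Mr=0, Miss=1, Mrs=2, Others=3) are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample rows written in the style of the Titanic "Name" column.
df = pd.DataFrame({
    "Name": [
        "Braund, Mr. Owen Harris",
        "Cumings, Mrs. John Bradley (Florence Briggs Thayer)",
        "Heikkinen, Miss. Laina",
        "Uruchurtu, Don. Manuel E",
    ]
})

# Pull out the word that sits right before the first period, e.g. "Mr".
df["Title"] = df["Name"].str.extract(r" ([A-Za-z]+)\.", expand=False)

# Assumed encoding: the three common titles get their own code,
# and every other title falls into "Others" (3).
title_mapping = {"Mr": 0, "Miss": 1, "Mrs": 2}
df["TitleCode"] = df["Title"].map(title_mapping).fillna(3).astype(int)

print(df[["Title", "TitleCode"]])
```

Rare titles such as Don or Dr have too few rows to learn from on their own, which is one reason to fold them into a single Others bucket.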

4.3 Sex
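This section has no text of its own, so the following is only one plausible way to turn Sex into a numeric feature. The mapping male=0, female=1 and the `SexCode` column name are assumptions, and the sample values are made up.

```python
import pandas as pd

# Hypothetical sample of the "Sex" column.
df = pd.DataFrame({"Sex": ["male", "female", "female", "male"]})

# Assumed encoding: any consistent pair of numbers would work here.
sex_mapping = {"male": 0, "female": 1}
df["SexCode"] = df["Sex"].map(sex_mapping)

print(df)
```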

4.4 Age

4.4.1 Some Age values are missing. Let's fill each missing Age with the median age of that passenger's Title
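Filling a missing Age with the median of its Title group could look something like this. It is a sketch on made-up numbers, assuming a `Title` column already exists. `groupby(...).transform("median")` returns one median per row, aligned with the original index, so it can be passed straight to `fillna`.

```python
import pandas as pd

# Hypothetical data: one missing Age in each Title group.
df = pd.DataFrame({
    "Title": ["Mr", "Mr", "Mr", "Miss", "Miss", "Miss"],
    "Age": [30.0, 40.0, float("nan"), 20.0, float("nan"), 24.0],
})

# Median age per Title, broadcast back to every row (NaN values are skipped).
median_by_title = df.groupby("Title")["Age"].transform("median")

# Only the missing entries are replaced; known ages stay untouched.
df["Age"] = df["Age"].fillna(median_by_title)

print(df)
```

Using the per-Title median rather than one global median keeps, for example, a missing Miss age from being pulled toward the typical Mr age.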

As shown above, the distribution by age is easy to see~ ^0^

 

Kaggle : https://www.kaggle.com/c/titanic

Source : https://www.youtube.com/watch?v=FAP7JOECfEE
