Hi all,

When you get data in CSV or any other format, try these 10 basic commands to understand it better. The better you understand the dataset, the better the model you can build for it.

Before jumping into the code, here is the basic software you will want to have:

Python
Jupyter Notebook (highly recommended) or Spyder
pandas (it ships with the Anaconda distribution; otherwise install it with pip install pandas)

That's it. These are the basics you need to get started.

First, get some data from Kaggle or anywhere else. Let's assume the data is in CSV format to keep things simple. Import the necessary package and load the data like this:

import pandas as pd
df = pd.read_csv("path to csv file")

Now let's go through the most common commands:

1. df.head(): Shows the first rows of the dataset (five by default).
2. df.info(): Shows the type of each column, such as float, int, or object.
3. df.describe(): This will describe the column in the form as
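As a minimal sketch of the first three commands, the snippet below builds a small DataFrame from made-up in-memory data and calls head(), info(), and describe() on it. The column names and values are hypothetical, not from any real dataset. With a real file you would pass its path to pd.read_csv instead of the StringIO object.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for a real CSV file.
csv_text = """name,age,score
Ann,23,88.5
Bob,31,92.0
Cara,27,79.5
Dan,35,85.0
"""

# pd.read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# 1. First rows of the dataset (5 by default; pass n to change it).
print(df.head(2))

# 2. Column names, non-null counts, and dtypes. info() prints directly.
df.info()

# 3. Summary statistics; by default only numeric columns are included.
summary = df.describe()
print(summary)
```

Note that describe() skips the text column here, because by default it only summarizes numeric columns when the DataFrame has mixed types.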
This is the first step in a data scientist's work. Data is very important in this field; without data there is nothing to do. Data is everything, and it is not only numbers: it can be images, audio, video, or anything else. In this post, I will walk through the process of loading a simple dataset in CSV format.

I recommend installing Jupyter or Spyder from the Anaconda website.

Packages to install from cmd or a terminal:

pip install pandas
pip install numpy
pip install matplotlib

With Anaconda, these packages are usually installed already, so just verify that they are present.

Numpy
A mathematical tool used for complex calculations.

Pandas
With pandas, we can import the data. It is mainly used to build data frames and for related data-handling tasks.

Matplotlib
One of the visualization packages in Python. It is used to visualize data.

I write the standard code below to import a CSV file:
import