5 Marks Each

Question 1 (5 Marks)

Consider the following DataFrame Df1 and answer any four questions from (a) – (e).

	City	Hospitals	Schools
0	Delhi	189	7916
1	Mumbai	208	8508
2	Kolkata	149	7226
3	Chennai	157	7617

(a) Choose the right statement to get the given output:

(i) Df1.mean()

(ii) Df1.mean(axis=1)

(iii) Df1.average()

(iv) Df1.median()

(b) Write the command to get the given output:

	City	Hospitals	Schools
3	Chennai	157	7617
0	Delhi	189	7916
2	Kolkata	149	7226
1	Mumbai	208	8508

(i) Df1.sort(by='City')

(ii) Df1.sort_values('City')

(iii) Df1.sort_values(by='City')

(iv) Df1.sort_values(by=='City')

(c) Choose the right statement to get given output:

	Hospitals	Schools
count	4.000000	4.000000
mean	175.750000	7816.750000
std	27.584718	540.543785
min	149.000000	7226.000000
25%	155.000000	7519.250000
50%	173.000000	7766.500000
75%	193.750000	8064.000000
max	208.000000	8508.000000

(i) Df1.desc()

(ii) Df1.statistics()

(iii) Df1.describe()

(iv) Df1.showall()

(d) Choose the right function to complete the given statement so that City becomes the index:

Df1._________________('City',inplace=True)

(i) Df1.set_index('City',inplace=True)

(ii) Df1.index('City',inplace=True)

(iii) Df1.new_index('City',inplace=True)

(iv) Df1.reset_index('City',inplace=True)

(e) Which Pandas command is used to rename the columns and index labels of the above dataframe?

(i) Df1.renamecolumns()

(ii) Df1.Rename()

(iii) Df1.rename()

(iv) Df1.indexrename()

Answer

(a) (i) Df1.mean()

(b) (iii) Df1.sort_values(by='City')

(c) (iii) Df1.describe()

(d) (i) Df1.set_index('City',inplace=True)

(e) (iii) Df1.rename()
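
The answers above can be tried on a small script. This is a minimal sketch that rebuilds Df1 from the table in the question; the variable names and the new labels passed to rename() are illustrative assumptions. Recent pandas versions may refuse to average the text column City, so the sketch passes numeric_only=True to mean().

```python
import pandas as pd

# Rebuild Df1 from the table shown in the question.
Df1 = pd.DataFrame({
    "City": ["Delhi", "Mumbai", "Kolkata", "Chennai"],
    "Hospitals": [189, 208, 149, 157],
    "Schools": [7916, 8508, 7226, 7617],
})

# (a) Column-wise mean; numeric_only=True skips the text column "City".
col_means = Df1.mean(numeric_only=True)

# (b) Sort rows alphabetically by city (returns a new DataFrame).
by_city = Df1.sort_values(by="City")

# (c) Summary statistics for the numeric columns.
summary = Df1.describe()

# (d) Make City the index; done on a copy so Df1 itself stays unchanged.
indexed = Df1.copy()
indexed.set_index("City", inplace=True)

# (e) Rename one column and one index label (the new names are made up).
renamed = indexed.rename(columns={"Hospitals": "Hosp"},
                         index={"Delhi": "New Delhi"})

print(by_city)
print(summary)
```

Note that sort_values(), describe() and rename() return new objects, while set_index(..., inplace=True) modifies the DataFrame it is called on.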


Question 2 (5 Marks)

Consider the following DataFrame "emp" and answer any four questions from (a) – (e).

Ecode	Name	Age	Fav_Color	Salary
101	Rohit	20	Blue	45000
102	Mohanti	24	Red	36000
103	Tushar Koul	23	Green	42000
104	Rupali	22	Yellow	38000
105	Gurpreet	21	Pink	40000

(a) Select the command from the given options that will give the following output:-

(i) print(emp.max)

(ii) print(emp.max(axis=1))

(iii) print(emp.max,axis=1)

(iv) print(emp.max())

(b) A manager wants to know the Favourite colour of the employee with the employee code 103. Help him to identify the correct set of statements from the given options:

(i) df1=emp[emp['Ecode']==103]

print(df1)

(ii) df1=emp['Ecode'==103]

print(df1)

(iii) df1=emp[emp.Ecode=103]

print(df1)

(iv) df1=emp[emp.Ecode==103]

print(df1)

(c) Which of the following statements will give the names of the employees whose salary is more than 40000?

(i) print(emp.max())

(ii) print(emp[emp["Salary"]>40000])

(iii) print(emp["Salary"]>40000)

(iv) print(emp.max()>40000)

(d) Which of the following commands will list only the columns Ename and Salary using loc?

(i) print(emp.loc[:,[0,2]]

(ii) print(emp.loc[:,["Ename","Salary"]])

(iii) print(emp.loc(:["Ename","Salary"]))

(iv) print(emp.loc[["Ename","Salary"]])

(e) Mr. Singh, the manager, wants to add a new column, Rank, with the values 'IV', 'II', 'III', 'IV', 'I' to the data frame. Help him identify the right command from the following:

(i) emp.column=['IV', 'II', 'III', 'IV', 'I']

(ii) emp.iloc["Rank"] =['IV', 'II', 'III', 'IV', 'I']

(iii) emp["Rank"] =['IV', 'II', 'III', 'IV', 'I']

(iv) None of the above

Answer

(a) (iv) print(emp.max())

(b) (i) df1 = emp[emp["Ecode"]==103]
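
The operations asked about in parts (a) to (e) can be sketched on a rebuilt copy of emp. This is only an illustration, not the official answer key. One assumption to note: the table header shows the column as Name, while part (d) refers to Ename, so the sketch uses Name. The result variable names are also made up.

```python
import pandas as pd

# Rebuild emp from the table shown in the question.
emp = pd.DataFrame({
    "Ecode": [101, 102, 103, 104, 105],
    "Name": ["Rohit", "Mohanti", "Tushar Koul", "Rupali", "Gurpreet"],
    "Age": [20, 24, 23, 22, 21],
    "Fav_Color": ["Blue", "Red", "Green", "Yellow", "Pink"],
    "Salary": [45000, 36000, 42000, 38000, 40000],
})

# (a) Column-wise maximum; text columns are compared alphabetically.
col_max = emp.max()

# (b) A boolean mask keeps only the row whose Ecode is 103.
row_103 = emp[emp["Ecode"] == 103]

# (c) Rows of employees earning more than 40000.
high_paid = emp[emp["Salary"] > 40000]

# (d) loc selects by label: all rows, two named columns.
# Assumption: "Name" is used because the table has no "Ename" column.
name_salary = emp.loc[:, ["Name", "Salary"]]

# (e) Assigning a list to a new key adds a column of the same length.
emp["Rank"] = ["IV", "II", "III", "IV", "I"]

print(row_103)
print(high_paid)
```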


Question 3 (5 Marks)

Answer the following questions:

	id	Feature1	Feature2
0	1	A	B
1	2	C	D
2	3	E	F
3	4	G	H
4	5	I	J

	id	Feature1	Feature2
0	1	K	L
1	2	M	N
2	6	O	P
3	7	Q	R
4	8	S	T

(i) To create the data frame for the above dataset.

(ii) To join the data frames.

(iii) To count the rows in the new data frame.

(iv) To reset the index.
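
No answer is recorded for this question, so the following is one possible approach rather than the expected solution. It assumes the two tables are called df1 and df2 (the question does not name them) and reads "join" as stacking the rows with pd.concat; merging on id with pd.merge would be a different, equally plausible reading.

```python
import pandas as pd

# (i) Create the two data frames from the tables in the question.
# The names df1 and df2 are assumptions.
df1 = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "Feature1": ["A", "C", "E", "G", "I"],
    "Feature2": ["B", "D", "F", "H", "J"],
})
df2 = pd.DataFrame({
    "id": [1, 2, 6, 7, 8],
    "Feature1": ["K", "M", "O", "Q", "S"],
    "Feature2": ["L", "N", "P", "R", "T"],
})

# (ii) Stack the rows of df2 under df1; the old index labels 0..4 repeat.
joined = pd.concat([df1, df2])

# (iii) Number of rows in the combined frame.
n_rows = len(joined)

# (iv) Replace the repeated labels with a fresh 0..n-1 index.
joined = joined.reset_index(drop=True)

print(n_rows)
print(joined)
```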


Question 4 (5 Marks)

Answer the following questions on the basis of the given dataframe:

(i) To print the maximum salary of the Total Salary column.

(ii) To print the data in ascending order of Total Salary.

(iii) To print the data in descending order of Total Salary.

(iv) To print 'sum', 'mean', 'max', 'min', 'count', 'median', 'var' for the columns 'SALARY', 'tax', 'Total Salary'.
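
The dataframe for this question is not reproduced here, so the sketch below uses a few hypothetical rows purely to show the calls. Only the column names SALARY, tax and Total Salary come from the question; the values and the name df are invented for illustration.

```python
import pandas as pd

# Hypothetical rows: the question's dataframe is not shown in the text.
df = pd.DataFrame({
    "SALARY": [30000, 45000, 38000],
    "tax": [3000, 4500, 3800],
    "Total Salary": [27000, 40500, 34200],
})

# (i) Maximum of the Total Salary column.
max_total = df["Total Salary"].max()

# (ii) Rows in ascending order of Total Salary.
ascending = df.sort_values(by="Total Salary")

# (iii) Rows in descending order of Total Salary.
descending = df.sort_values(by="Total Salary", ascending=False)

# (iv) Several statistics at once: one row per statistic, one column per field.
stats = df[["SALARY", "tax", "Total Salary"]].agg(
    ["sum", "mean", "max", "min", "count", "median", "var"]
)

print(max_total)
print(stats)
```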


Question 5 (5 Marks)

On the basis of the given dataframe, answer the following questions:

	Account	Name	Rep	Manager	Product	Quantity	Price	Status
0	714466	Tata	Saryu	Abhishek	CPU	1	30000	presented
1	714466	Tata	Saryu	Abhishek	Software	1	10000	presented
2	714466	Tata	Saryu	Abhishek	Maintenance	2	5000	pending
3	737550	Infosys	Saryu	Abhishek	CPU	1	35000	declined
4	146832	Sapient	Taneja	Abhishek	CPU	2	65000	won
5	218895	IBM	Taneja	Abhishek	CPU	2	40000	pending
6	218895	IBM	Taneja	Abhishek	Software	1	10000	presented
7	412290	Oracle	Joe	Abhishek	Maintenance	2	5000	pending
8	740150	Flipkart	Joe	Abhishek	CPU	1	35000	declined
9	141962	Byju	Charu	Arush	CPU	2	65000	won
10	163416	Gradup	Charu	Arush	CPU	1	30000	presented
11	239344	Funtoot	Charu	Arush	Maintenance	1	5000	pending
12	239344	Funtoot	Charu	Arush	Software	1	10000	presented
13	307599	SQL	Naveen	Arush	Maintenance	3	7000	won
14	688981	PiE	Naveen	Arush	CPU	5	100000	won
15	729833	Amazon	Naveen	Arush	CPU	2	65000	declined
16	729833	Amazon	Naveen	Arush	Monitor	2	5000	presented

(i) To print the complete dataframe Name wise.

(ii) To print the dataframe Name wise, Rep wise and Manager wise.

(iii) To print the data frame Manager and Rep wise.

(iv) To print the Price of the data frame, Manager and Rep wise.

(v) To print the sum of the price, manager and rep wise.

(vi) To print the mean and count of the price which belong to each manager and rep.

(vii) To print the sum of the price which belong to each manager and rep.

(viii) To print the sum of the price which belong to each manager and rep along with Product belongs to them. Fill the NaN with 0.

Answer

(i) pd.pivot_table(df,index=["Name"])

(ii) pd.pivot_table(df,index=["Name","Rep","Manager"])

(iii) pd.pivot_table(df,index=["Manager","Rep"])

(iv) pd.pivot_table(df,index=["Manager","Rep"],values=["Price"])

(v) pd.pivot_table(df,index=["Manager","Rep"],values=["Price"],aggfunc=np.sum)

(vi) pd.pivot_table(df,index=["Manager","Rep"],values=["Price"],aggfunc=[np.mean,len])

(vii) pd.pivot_table(df,index=["Manager","Rep"],values=["Price"], columns=["Product"],aggfunc=[np.sum])

(viii) pd.pivot_table(df,index=["Manager","Rep"],values=["Price"],columns=["Product"],aggfunc=[np.sum],fill_value=0)
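
A runnable sketch of the pivot_table answers, using the rows from the table in the question. Two choices here are assumptions rather than part of the answer key: aggregation functions are passed as strings ("sum", "mean", "count") instead of np.sum, np.mean and len, and values is always given explicitly, since averaging the text columns by default may fail on recent pandas versions.

```python
import pandas as pd

columns = ["Account", "Name", "Rep", "Manager", "Product",
           "Quantity", "Price", "Status"]
rows = [
    (714466, "Tata", "Saryu", "Abhishek", "CPU", 1, 30000, "presented"),
    (714466, "Tata", "Saryu", "Abhishek", "Software", 1, 10000, "presented"),
    (714466, "Tata", "Saryu", "Abhishek", "Maintenance", 2, 5000, "pending"),
    (737550, "Infosys", "Saryu", "Abhishek", "CPU", 1, 35000, "declined"),
    (146832, "Sapient", "Taneja", "Abhishek", "CPU", 2, 65000, "won"),
    (218895, "IBM", "Taneja", "Abhishek", "CPU", 2, 40000, "pending"),
    (218895, "IBM", "Taneja", "Abhishek", "Software", 1, 10000, "presented"),
    (412290, "Oracle", "Joe", "Abhishek", "Maintenance", 2, 5000, "pending"),
    (740150, "Flipkart", "Joe", "Abhishek", "CPU", 1, 35000, "declined"),
    (141962, "Byju", "Charu", "Arush", "CPU", 2, 65000, "won"),
    (163416, "Gradup", "Charu", "Arush", "CPU", 1, 30000, "presented"),
    (239344, "Funtoot", "Charu", "Arush", "Maintenance", 1, 5000, "pending"),
    (239344, "Funtoot", "Charu", "Arush", "Software", 1, 10000, "presented"),
    (307599, "SQL", "Naveen", "Arush", "Maintenance", 3, 7000, "won"),
    (688981, "PiE", "Naveen", "Arush", "CPU", 5, 100000, "won"),
    (729833, "Amazon", "Naveen", "Arush", "CPU", 2, 65000, "declined"),
    (729833, "Amazon", "Naveen", "Arush", "Monitor", 2, 5000, "presented"),
]
df = pd.DataFrame(rows, columns=columns)

# Total Price for each (Manager, Rep) pair.
total = pd.pivot_table(df, index=["Manager", "Rep"],
                       values="Price", aggfunc="sum")

# Mean and count of Price for each (Manager, Rep) pair.
mean_count = pd.pivot_table(df, index=["Manager", "Rep"],
                            values="Price", aggfunc=["mean", "count"])

# Total Price split by Product; missing combinations are filled with 0.
by_product = pd.pivot_table(df, index=["Manager", "Rep"], values="Price",
                            columns="Product", aggfunc="sum", fill_value=0)

print(total)
print(by_product)
```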


Question 6 (5 Marks)

Answer the following questions on the basis of the given data set:

	index	date	duration	item	month	network	network_type
0	0	15/10/14 06:58	34.429	data	2014-11	data	data
1	1	15/10/14 06:58	13.000	call	2014-11	Vodafone	mobile
2	2	15/10/14 14:46	23.000	call	2014-11	Airtel	mobile
3	3	15/10/14 14:48	4.000	call	2014-11	data	mobile
4	4	15/10/14 17:27	4.000	call	2014-11	Airtel	mobile
5	5	15/10/14 18:55	4.000	call	2014-11	Airtel	mobile
6	6	16/10/14 06:58	34.429	call	2014-11	data	data
7	7	16/10/14 15:01	602.000	call	2014-11	Vodafone	mobile
8	8	16/10/14 15:12	1050.000	call	2014-11	Airtel	mobile
9	9	16/10/14 15:30	19.000	call	2014-11	voicemail	voicemail
10	10	16/10/14 16:21	1183.000	call	2014-11	Vodafone	mobile
11	11	16/10/14 22:18	1.000	sms	2014-11	Airtel	mobile
12	12	16/10/14 22:21	1.000	sms	2014-11	Vodafone	mobile
13	13	17/10/14 06:58	34.429	data	2014-11	data	data

(i) To count the rows in the dataset

(ii) What was the longest phone call / data entry?

(iii) How many seconds of phone calls are recorded in total?

(iv) How many entries are there for each month?

(v) To print the group key

(vi) To count the group keys

(vii) Get the first entry for each month

(viii) Get the sum of the durations per month

(ix) Get the number of dates / entries in each month

(x) What is the sum of durations, for calls only, to each network

(xi) How many calls, sms, and data entries are in each month?

(xii) How many calls, texts, and data are sent per month, split by network_type?

(xiii) Group the data frame by month and item and extract a number of stats from each group

(xiv) Group the data frame by month and item and extract a number of stats from each group

Answer

(i) data['item'].count()

(ii) data['duration'].max()

(iii) data['duration'][data['item'] == 'call'].sum()

(iv) data['month'].value_counts()

(v) data.groupby(['month']).groups.keys()

(vi) len(data.groupby(['month']).groups['2014-11'])

(vii) data.groupby('month').first()

(viii) data.groupby('month')['duration'].sum()

(ix) data.groupby('month')['date'].count()

(x) data[data['item'] == 'call'].groupby('network')['duration'].sum()

(xi) data.groupby(['month', 'item'])['date'].count()

(xii) data.groupby(['month', 'network_type'])['date'].count()

(xiii)

(xiv)
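
Several of the answers above can be checked on a rebuilt copy of the data set. Items (xiii) and (xiv) have no answer recorded, so the last statement is only one possible reading of them: named aggregation with groupby(...).agg(...). The output column names total_duration, longest and entries are made up for this sketch, and the redundant "index" column from the table is omitted.

```python
import pandas as pd

columns = ["date", "duration", "item", "month", "network", "network_type"]
rows = [
    ("15/10/14 06:58", 34.429, "data", "2014-11", "data", "data"),
    ("15/10/14 06:58", 13.0, "call", "2014-11", "Vodafone", "mobile"),
    ("15/10/14 14:46", 23.0, "call", "2014-11", "Airtel", "mobile"),
    ("15/10/14 14:48", 4.0, "call", "2014-11", "data", "mobile"),
    ("15/10/14 17:27", 4.0, "call", "2014-11", "Airtel", "mobile"),
    ("15/10/14 18:55", 4.0, "call", "2014-11", "Airtel", "mobile"),
    ("16/10/14 06:58", 34.429, "call", "2014-11", "data", "data"),
    ("16/10/14 15:01", 602.0, "call", "2014-11", "Vodafone", "mobile"),
    ("16/10/14 15:12", 1050.0, "call", "2014-11", "Airtel", "mobile"),
    ("16/10/14 15:30", 19.0, "call", "2014-11", "voicemail", "voicemail"),
    ("16/10/14 16:21", 1183.0, "call", "2014-11", "Vodafone", "mobile"),
    ("16/10/14 22:18", 1.0, "sms", "2014-11", "Airtel", "mobile"),
    ("16/10/14 22:21", 1.0, "sms", "2014-11", "Vodafone", "mobile"),
    ("17/10/14 06:58", 34.429, "data", "2014-11", "data", "data"),
]
data = pd.DataFrame(rows, columns=columns)

# (iii) Total duration of the rows whose item is "call".
call_seconds = data["duration"][data["item"] == "call"].sum()

# (x) Call duration summed per network.
per_network = data[data["item"] == "call"].groupby("network")["duration"].sum()

# (xi) Number of entries per month and item type.
per_item = data.groupby(["month", "item"])["date"].count()

# (xiii)/(xiv) Possible approach: several statistics per (month, item) group.
stats = data.groupby(["month", "item"]).agg(
    total_duration=("duration", "sum"),
    longest=("duration", "max"),
    entries=("date", "count"),
)

print(per_network)
print(stats)
```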


Question 7 (5 Marks)

Answer the questions below based on the given dataset:

DataFrame: dfzoo

	animal	uniq_id	water_need
0	Elephant	1001	500
1	Elephant	1002	600
2	Elephant	1003	550
3	Tiger	1004	300
4	Tiger	1005	320
5	Tiger	1006	330
6	Tiger	1007	290
7	Tiger	1008	310
8	Zebra	1009	200
9	Zebra	1010	220
10	Zebra	1011	240
11	Zebra	1012	230
12	Zebra	1013	220
13	Zebra	1013	100
14	Zebra	1014	80
15	Lion	1015	420
16	Lion	1016	600
17	Lion	1017	500
18	Lion	1018	390
19	Kangaroo	1019	410
20	Kangaroo	1020	430
21	Kangaroo	1021	410