Questions

5 Marks Each

Question 1 (5 Marks)
Consider the following DataFrame Df1 and answer any four questions from (a) – (e).

   City     Hospitals  Schools
0  Delhi    189        7916
1  Mumbai   208        8508
2  Kolkata  149        7226
3  Chennai  157        7617

(a) Choose the right statement to get the given output:

(i) Df1.mean()

(ii) Df1.mean(axis=1)

(iii) Df1.average()

(iv) Df1.median()

(b) Write the command to get the given output:

   City     Hospitals  Schools
3  Chennai  157        7617
0  Delhi    189        7916
2  Kolkata  149        7226
1  Mumbai   208        8508

(i) Df1.sort(by='City')

(ii) Df1.sort_values('City')

(iii) Df1.sort_values(by='City')

(iv) Df1.sort_values(by=='City')

(c) Choose the right statement to get the given output:

       Hospitals    Schools
count  4.000000     4.000000
mean   175.750000   7816.750000
std    27.584718    540.543785
min    149.000000   7226.000000
25%    155.000000   7519.250000
50%    173.000000   7766.500000
75%    193.750000   8064.000000
max    208.000000   8508.000000

(i) Df1.desc()

(ii) Df1.statistics()

(iii) Df1.describe()

(iv) Df1.showall()

(d) Choose the right function to fill in the given statement to make City the index:

Df1._________________('City',inplace=True)

(i) Df1.set_index('City',inplace=True)

(ii) Df1.index('City',inplace=True)

(iii) Df1.new_index('City',inplace=True)

(iv) Df1.reset_index('City',inplace=True)

(e) Which Pandas command is used to rename the columns and index labels of the above DataFrame?

(i) Df1.renamecolumns()

(ii) Df1.Rename()

(iii) Df1.rename()

(iv) Df1.indexrename()

Answer
(a) (i) Df1.mean()

(b) (iii) Df1.sort_values(by='City')

(c) (iii) Df1.describe()

(d) (i) Df1.set_index('City',inplace=True)

(e) (iii) Df1.rename()
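
The answers above can be tried out with a short script. The following is a minimal sketch that rebuilds Df1 from the table in the question. The variable names (means, by_city, summary, indexed, renamed) and the new labels passed to rename() are illustrative only and do not come from the question. It passes numeric_only=True to mean() as an assumption, because recent pandas versions may refuse to average the text column City.

```python
import pandas as pd

# Rebuild Df1 from the table in the question.
Df1 = pd.DataFrame({
    "City": ["Delhi", "Mumbai", "Kolkata", "Chennai"],
    "Hospitals": [189, 208, 149, 157],
    "Schools": [7916, 8508, 7226, 7617],
})

# (a) Column-wise mean; numeric_only=True skips the text column City.
means = Df1.mean(numeric_only=True)

# (b) Rows ordered alphabetically by City (returns a new DataFrame).
by_city = Df1.sort_values(by="City")

# (c) Summary statistics for the numeric columns.
summary = Df1.describe()

# (d) Make City the index; done on a copy so Df1 itself stays unchanged.
indexed = Df1.copy()
indexed.set_index("City", inplace=True)

# (e) rename() takes mappings for column labels and for index labels.
#     "Hosp" and "New Delhi" are made-up labels for illustration.
renamed = indexed.rename(columns={"Hospitals": "Hosp"},
                         index={"Delhi": "New Delhi"})

print(means, by_city, summary, renamed, sep="\n\n")
```

Because sort_values() and rename() return new objects here, Df1 keeps its original row order and labels throughout.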

Question 2 (5 Marks)
Consider the following DataFrame 'emp' and answer any four questions from (a) – (e).

Ecode  Name         Age  Fav_Color  Salary
101    Rohit        20   Blue       45000
102    Mohanti      24   Red        36000
103    Tushar Koul  23   Green      42000
104    Rupali       22   Yellow     38000
105    Gurpreet     21   Pink       40000

(a) Select the command from the given options that will give the following output:-

(i) print(emp.max)

(ii) print(emp.max(axis=1))

(iii) print(emp.max,axis=1)

(iv) print(emp.max())

(b) A manager wants to know the Favourite colour of the employee with the employee code 103. Help him to identify the correct set of statements from the given options:

(i) df1=emp[emp['Ecode']==103]

print(df1)

(ii) df1=emp['Ecode'==103]

print(df1)

(iii) df1=emp[emp.Ecode=103]

print(df1)

(iv) df1=emp[emp.Ecode==103]

print(df1)

(c) Which of the following statements will give the names of the employees whose salary is more than 40000?

(i) print(emp.max())

(ii) print(emp[emp["Salary"]>40000])

(iii) print(emp["Salary"]>40000)

(iv) print(emp.max()>40000)

(d) Which of the following commands will list only the columns Ename and Salary using loc?

(i) print(emp.loc[:,[0,2]])

(ii) print(emp.loc[:,["Ename","Salary"]])

(iii) print(emp.loc(:["Ename","Salary"]))

(iv) print(emp.loc[["Ename","Salary"]])

(e) Mr. Singh, the manager, wants to add a new column, Rank, with the values 'IV', 'II', 'III', 'IV', 'I' to the data frame. Help him identify the right command from the following:

(i) emp.column=['IV', 'II', 'III', 'IV', 'I']

(ii) emp.iloc["Rank"] =['IV', 'II', 'III', 'IV', 'I']

(iii) emp["Rank"] =['IV', 'II', 'III', 'IV', 'I']

(iv) None of the above

Answer
(a) (iv) print(emp.max())

(b) (i) df1 = emp[emp["Ecode"]==103]
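
The listing gives answers for parts (a) and (b) only. The following sketch is one way to try parts (b) to (e) against the table; it is not the official key. It assumes the column is labelled Name, as in the table, although option (d) writes Ename. The variable names high_paid and name_salary are illustrative.

```python
import pandas as pd

# Rebuild emp from the table in the question.
emp = pd.DataFrame({
    "Ecode": [101, 102, 103, 104, 105],
    "Name": ["Rohit", "Mohanti", "Tushar Koul", "Rupali", "Gurpreet"],
    "Age": [20, 24, 23, 22, 21],
    "Fav_Color": ["Blue", "Red", "Green", "Yellow", "Pink"],
    "Salary": [45000, 36000, 42000, 38000, 40000],
})

# (b) A boolean mask keeps only the row whose Ecode is 103.
df1 = emp[emp["Ecode"] == 103]

# (c) Rows with Salary above 40000.
high_paid = emp[emp["Salary"] > 40000]

# (d) loc with ":" for all rows and a list of column labels.
#     Assumption: the label is "Name" (the table), not "Ename" (the option).
name_salary = emp.loc[:, ["Name", "Salary"]]

# (e) Assigning a list to a new label adds the column.
emp["Rank"] = ["IV", "II", "III", "IV", "I"]

print(df1, high_paid, name_salary, emp, sep="\n\n")
```

The attribute form emp[emp.Ecode==103] in option (iv) builds the same boolean mask as option (i), so both select the same row.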

Question 3 (5 Marks)

Answer the following questions:

   id  Feature1  Feature2
0  1   A         B
1  2   C         D
2  3   E         F
3  4   G         H
4  5   I         J

   id  Feature1  Feature2
0  1   K         L
1  2   M         N
2  6   O         P
3  7   Q         R
4  8   S         T

(i) To create the data frame for the above dataset.

(ii) To join the data frames.

(iii) To count the rows in the new data frame.

(iv) To reset the index.
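
No answer is shown for this question, so the following is only a sketch of one possible approach. It assumes that "join" means stacking the rows of the second table under the first with pd.concat; a key-based pd.merge on id would be a different reading. The names df1, df2, joined and row_count are illustrative.

```python
import pandas as pd

# (i) Build the two data frames from the tables in the question.
df1 = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "Feature1": ["A", "C", "E", "G", "I"],
    "Feature2": ["B", "D", "F", "H", "J"],
})
df2 = pd.DataFrame({
    "id": [1, 2, 6, 7, 8],
    "Feature1": ["K", "M", "O", "Q", "S"],
    "Feature2": ["L", "N", "P", "R", "T"],
})

# (ii) Stack df2 under df1; the old index labels 0-4 appear twice.
joined = pd.concat([df1, df2])

# (iii) Number of rows in the combined frame.
row_count = len(joined)

# (iv) Replace the repeated labels with a fresh 0..n-1 index.
joined = joined.reset_index(drop=True)

print(row_count)
print(joined)
```

Passing drop=True to reset_index() discards the old labels instead of keeping them as an extra column.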

Question 4 (5 Marks)
Answer the following question on the basis of given dataframe:

(i) To print the maximum salary of the Total Salary column.

(ii) To print the data in ascending order of Total Salary.

(iii) To print the data in descending order of Total Salary.

(iv) To print 'sum', 'mean', 'max', 'min', 'count', 'median', 'var' for the columns 'SALARY', 'tax', 'Total Salary'.
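
The dataframe for this question is not reproduced in the text, so the sketch below uses three made-up rows purely for illustration. Only the column names SALARY, tax and Total Salary come from the question; the values, the name df and the other variable names are assumptions.

```python
import pandas as pd

# Hypothetical data: the question's dataframe is not shown in the text.
df = pd.DataFrame({
    "SALARY": [45000, 30000, 50000],
    "tax": [4500, 3000, 5000],
    "Total Salary": [40500, 27000, 45000],
})

# (i) Largest value in the Total Salary column.
max_total = df["Total Salary"].max()

# (ii) Rows in ascending order of Total Salary.
ascending = df.sort_values(by="Total Salary")

# (iii) Rows in descending order of Total Salary.
descending = df.sort_values(by="Total Salary", ascending=False)

# (iv) Several statistics at once for the three columns.
stats = df[["SALARY", "tax", "Total Salary"]].agg(
    ["sum", "mean", "max", "min", "count", "median", "var"]
)

print(max_total, ascending, descending, stats, sep="\n\n")
```

agg() with a list of function names returns one row per statistic and one column per input column.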

Question 5 (5 Marks)
On the basis of given dataframe answer the following questions:

    Account  Name      Rep     Manager   Product      Quantity  Price   Status
0   714466   Tata      Saryu   Abhishek  CPU          1         30000   presented
1   714466   Tata      Saryu   Abhishek  Software     1         10000   presented
2   714466   Tata      Saryu   Abhishek  Maintenance  2         5000    pending
3   737550   Infosys   Saryu   Abhishek  CPU          1         35000   declined
4   146832   Sapient   Taneja  Abhishek  CPU          2         65000   won
5   218895   IBM       Taneja  Abhishek  CPU          2         40000   pending
6   218895   IBM       Taneja  Abhishek  Software     1         10000   presented
7   412290   Oracle    Joe     Abhishek  Maintenance  2         5000    pending
8   740150   Flipkart  Joe     Abhishek  CPU          1         35000   declined
9   141962   Byju      Charu   Arush     CPU          2         65000   won
10  163416   Gradup    Charu   Arush     CPU          1         30000   presented
11  239344   Funtoot   Charu   Arush     Maintenance  1         5000    pending
12  239344   Funtoot   Charu   Arush     Software     1         10000   presented
13  307599   SQL       Naveen  Arush     Maintenance  3         7000    won
14  688981   PiE       Naveen  Arush     CPU          5         100000  won
15  729833   Amazon    Naveen  Arush     CPU          2         65000   declined
16  729833   Amazon    Naveen  Arush     Monitor      2         5000    presented

(i) To print the complete dataframe Name wise.

(ii) To print the dataframe Name wise, Rep wise and Manager wise.

(iii) To print the data frame Manager and Rep wise.

(iv) To print the data frame Manager and Rep price wise

(v) To print the sum of the price, manager and rep wise.

(vi) To print the mean and count of the price which belong to each manager and rep.

(vii) To print the sum of the price which belong to each manager and rep.

(viii) To print the sum of the price which belong to each manager and rep along with Product belongs to them. Fill the NaN with 0.

Answer
(i) pd.pivot_table(df,index=["Name"])

(ii) pd.pivot_table(df,index=["Name","Rep","Manager"])

(iii) pd.pivot_table(df,index=["Manager","Rep"])

(iv) pd.pivot_table(df,index=["Manager","Rep"],values=["Price"])

(v) pd.pivot_table(df,index=["Manager","Rep"],values=["Price"],aggfunc=np.sum)

(vi) pd.pivot_table(df,index=["Manager","Rep"],values=["Price"],aggfunc=[np.mean,len])

(vii) pd.pivot_table(df,index=["Manager","Rep"],values=["Price"], columns=["Product"],aggfunc=[np.sum])

(viii) pd.pivot_table(df,index=["Manager","Rep"],values=["Price"],columns=["Product"],aggfunc=[np.sum],fill_value=0)
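
The Price pivots in the answers can be reproduced on the table from the question. The following is a sketch rather than the official key: it swaps the NumPy callables and len used above for the string names "sum", "mean" and "count", which is a substitution made here because newer pandas versions may warn when NumPy functions are passed as aggfunc. The variable names are illustrative.

```python
import pandas as pd

# Rows copied from the table in the question.
rows = [
    (714466, "Tata", "Saryu", "Abhishek", "CPU", 1, 30000, "presented"),
    (714466, "Tata", "Saryu", "Abhishek", "Software", 1, 10000, "presented"),
    (714466, "Tata", "Saryu", "Abhishek", "Maintenance", 2, 5000, "pending"),
    (737550, "Infosys", "Saryu", "Abhishek", "CPU", 1, 35000, "declined"),
    (146832, "Sapient", "Taneja", "Abhishek", "CPU", 2, 65000, "won"),
    (218895, "IBM", "Taneja", "Abhishek", "CPU", 2, 40000, "pending"),
    (218895, "IBM", "Taneja", "Abhishek", "Software", 1, 10000, "presented"),
    (412290, "Oracle", "Joe", "Abhishek", "Maintenance", 2, 5000, "pending"),
    (740150, "Flipkart", "Joe", "Abhishek", "CPU", 1, 35000, "declined"),
    (141962, "Byju", "Charu", "Arush", "CPU", 2, 65000, "won"),
    (163416, "Gradup", "Charu", "Arush", "CPU", 1, 30000, "presented"),
    (239344, "Funtoot", "Charu", "Arush", "Maintenance", 1, 5000, "pending"),
    (239344, "Funtoot", "Charu", "Arush", "Software", 1, 10000, "presented"),
    (307599, "SQL", "Naveen", "Arush", "Maintenance", 3, 7000, "won"),
    (688981, "PiE", "Naveen", "Arush", "CPU", 5, 100000, "won"),
    (729833, "Amazon", "Naveen", "Arush", "CPU", 2, 65000, "declined"),
    (729833, "Amazon", "Naveen", "Arush", "Monitor", 2, 5000, "presented"),
]
df = pd.DataFrame(rows, columns=["Account", "Name", "Rep", "Manager",
                                 "Product", "Quantity", "Price", "Status"])

# (v) Sum of Price for each Manager/Rep pair.
price_sum = pd.pivot_table(df, index=["Manager", "Rep"],
                           values="Price", aggfunc="sum")

# (vi) Mean and count of Price for each Manager/Rep pair.
price_stats = pd.pivot_table(df, index=["Manager", "Rep"],
                             values="Price", aggfunc=["mean", "count"])

# (viii) Sum of Price split by Product; missing combinations become 0.
by_product = pd.pivot_table(df, index=["Manager", "Rep"], columns="Product",
                            values="Price", aggfunc="sum", fill_value=0)

print(price_sum, price_stats, by_product, sep="\n\n")
```

Without fill_value=0, a Manager/Rep pair that never sold a given Product would show NaN in that column.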

Question 6 (5 Marks)
Answer the following questions on the basis of given data set:

    index  date            duration  item  month    network    network_type
0   0      15/10/14 06:58  34.429    data  2014-11  data       data
1   1      15/10/14 06:58  13.000    call  2014-11  Vodafone   mobile
2   2      15/10/14 14:46  23.000    call  2014-11  Airtel     mobile
3   3      15/10/14 14:48  4.000     call  2014-11  data       mobile
4   4      15/10/14 17:27  4.000     call  2014-11  Airtel     mobile
5   5      15/10/14 18:55  4.000     call  2014-11  Airtel     mobile
6   6      16/10/14 06:58  34.429    call  2014-11  data       data
7   7      16/10/14 15:01  602.000   call  2014-11  Vodafone   mobile
8   8      16/10/14 15:12  1050.000  call  2014-11  Airtel     mobile
9   9      16/10/14 15:30  19.000    call  2014-11  voicemail  voicemail
10  10     16/10/14 16:21  1183.000  call  2014-11  Vodafone   mobile
11  11     16/10/14 22:18  1.000     sms   2014-11  Airtel     mobile
12  12     16/10/14 22:21  1.000     sms   2014-11  Vodafone   mobile
13  13     17/10/14 06:58  34.429    data  2014-11  data       data

(i) To count the rows in the dataset

(ii) What was the longest phone call / data entry?

(iii) How many seconds of phone calls are recorded in total?

(iv) How many entries are there for each month?

(v) To print the group key

(vi) To count the group keys

(vii) Get the first entry for each month

(viii) Get the sum of the durations per month

(ix) Get the number of dates / entries in each month

(x) What is the sum of durations, for calls only, to each network

(xi) How many calls, sms, and data entries are in each month?

(xii) How many calls, texts, and data are sent per month, split by network_type?

(xiii) Group the data frame by month and item and extract a number of stats from each group

(xiv) Group the data frame by month and item and extract a number of stats from each group

Answer
(i) data['item'].count()

(ii) data['duration'].max()

(iii) data['duration'][data['item'] == 'call'].sum()

(iv) data['month'].value_counts()

(v) data.groupby(['month']).groups.keys()

(vi) len(data.groupby(['month']).groups['2014-11'])

(vii) data.groupby('month').first()

(viii) data.groupby('month')['duration'].sum()

(ix) data.groupby('month')['date'].count()

(x) data[data['item'] == 'call'].groupby('network')['duration'].sum()

(xi) data.groupby(['month', 'item'])['date'].count()

(xii) data.groupby(['month', 'network_type'])['date'].count()

(xiii)

(xiv)
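
Parts (xiii) and (xiv) are left blank in the listing. One way to extract several statistics per group is named aggregation with groupby().agg(), sketched below on the rows from the question. This is an assumption about what the question intends, not the official answer. The output column names (shortest, longest, total, entries) are made up for illustration, and the redundant "index" column of the table is omitted.

```python
import pandas as pd

# Rows copied from the table in the question (without the "index" column).
rows = [
    ("15/10/14 06:58", 34.429, "data", "2014-11", "data", "data"),
    ("15/10/14 06:58", 13.0, "call", "2014-11", "Vodafone", "mobile"),
    ("15/10/14 14:46", 23.0, "call", "2014-11", "Airtel", "mobile"),
    ("15/10/14 14:48", 4.0, "call", "2014-11", "data", "mobile"),
    ("15/10/14 17:27", 4.0, "call", "2014-11", "Airtel", "mobile"),
    ("15/10/14 18:55", 4.0, "call", "2014-11", "Airtel", "mobile"),
    ("16/10/14 06:58", 34.429, "call", "2014-11", "data", "data"),
    ("16/10/14 15:01", 602.0, "call", "2014-11", "Vodafone", "mobile"),
    ("16/10/14 15:12", 1050.0, "call", "2014-11", "Airtel", "mobile"),
    ("16/10/14 15:30", 19.0, "call", "2014-11", "voicemail", "voicemail"),
    ("16/10/14 16:21", 1183.0, "call", "2014-11", "Vodafone", "mobile"),
    ("16/10/14 22:18", 1.0, "sms", "2014-11", "Airtel", "mobile"),
    ("16/10/14 22:21", 1.0, "sms", "2014-11", "Vodafone", "mobile"),
    ("17/10/14 06:58", 34.429, "data", "2014-11", "data", "data"),
]
data = pd.DataFrame(rows, columns=["date", "duration", "item", "month",
                                   "network", "network_type"])

# Group by month and item, then compute several statistics per group.
# Each keyword is a new column name mapped to (source column, function).
stats = data.groupby(["month", "item"]).agg(
    shortest=("duration", "min"),
    longest=("duration", "max"),
    total=("duration", "sum"),
    entries=("date", "count"),
)

print(stats)
```

The result is indexed by (month, item), so a single group can be read with stats.loc[("2014-11", "call")].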

Question 7 (5 Marks)
Answer the question below based on given dataset:

DataFrame: dfzoo

    animal    uniq_id  water_need
0   Elephant  1001     500
1   Elephant  1002     600
2   Elephant  1003     550
3   Tiger     1004     300
4   Tiger     1005     320
5   Tiger     1006     330
6   Tiger     1007     290
7   Tiger     1008     310
8   Zebra     1009     200
9   Zebra     1010     220
10  Zebra     1011     240
11  Zebra     1012     230
12  Zebra     1013     220
13  Zebra     1013     100
14  Zebra     1014     80
15  Lion      1015     420
16  Lion      1016     600
17  Lion      1017     500
18  Lion      1018     390
19  Kangaroo  1019     410
20  Kangaroo  1020     430
21  Kangaroo  1021     410

(i) To count the values in every column.

(ii) To count the number of animals in the zoo.

(iii) To print the sum of the water need of the animals.

(iv) To print the sum of all values.

(v) To print the sum of only the numeric values.

(vi) To print the minimum value of water need.

(vii) To print the mean value of water need.

(viii) To print the median value of water need.

(ix) To print the animal-wise mean values.

(x) To print the mean water need of each animal.

Answer
(i)

(ii)

(iii)

(iv)

(v)

(vi)

(vii)

(viii)

(ix)

(x)
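
The answers are left blank in the listing. The sketch below shows one possible set of commands on the dfzoo table from the question; it is not the official key. The variable names are illustrative, and numeric_only=True is an assumption made so that the text column animal is skipped. Part (iv) is not run here: a plain dfzoo.sum() also tries to combine the text column, and how that behaves can depend on the pandas version.

```python
import pandas as pd

# Rebuild dfzoo from the table in the question (uniq_id 1013 repeats there).
dfzoo = pd.DataFrame({
    "animal": (["Elephant"] * 3 + ["Tiger"] * 5 + ["Zebra"] * 7
               + ["Lion"] * 4 + ["Kangaroo"] * 3),
    "uniq_id": [1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008,
                1009, 1010, 1011, 1012, 1013, 1013, 1014,
                1015, 1016, 1017, 1018, 1019, 1020, 1021],
    "water_need": [500, 600, 550, 300, 320, 330, 290, 310,
                   200, 220, 240, 230, 220, 100, 80,
                   420, 600, 500, 390, 410, 430, 410],
})

counts = dfzoo.count()                        # (i) non-null count per column
n_animals = dfzoo["animal"].count()           # (ii) number of animal rows
total_water = dfzoo["water_need"].sum()       # (iii) total water need
numeric_sums = dfzoo.sum(numeric_only=True)   # (v) sums of numeric columns
min_water = dfzoo["water_need"].min()         # (vi)
mean_water = dfzoo["water_need"].mean()       # (vii)
median_water = dfzoo["water_need"].median()   # (viii)

# (ix) Mean of every numeric column for each animal.
means_by_animal = dfzoo.groupby("animal").mean(numeric_only=True)

# (x) Mean water need for each animal.
water_by_animal = dfzoo.groupby("animal")["water_need"].mean()

print(counts, n_animals, total_water, numeric_sums, sep="\n\n")
print(min_water, mean_water, median_water)
print(means_by_animal, water_by_animal, sep="\n\n")
```

Selecting the water_need column before calling mean() in part (x) returns a Series, while part (ix) returns a DataFrame that also averages uniq_id.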
