Prepare Data for Exploration

*Weekly challenge 1*

1.

Question 1

If you have a short time frame for data collection and need an answer immediately, you likely will have to use historical data.

  • True
  • False

2.

Question 2

Which of the following is an example of continuous data?

  • Leading actors in a movie
  • Box office returns
  • Movie budget
  • Movie run time

3.

Question 3

Fill in the blank: The question “Where did you vacation last year?” is an example of collecting _____ data.

  • nominal quantitative
  • real quantitative
  • real qualitative
  • nominal qualitative

4.

Question 4

Internal data is more reliable because it’s shared publicly.

  • False
  • True

5.

Question 5

Structured data is likely to be found in which of the following formats? Select all that apply.

  • Spreadsheet
  • Audio file
  • Digital photo
  • Table

6.

Question 6

Fill in the blank: A Boolean data type can have _____ possible values.

  • 10
  • three
  • infinite
  • two

7.

Question 7

What do the columns contain in long data?

  • Different formats
  • The data types
  • The values and the context for the values
  • Specific constraints

8.

Question 8

Data transformation can change the structure of the data. An example of this is taking data stored in one format and converting it to another.

  • True
  • False
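
To make the ideas in Questions 7 and 8 concrete, here is a minimal sketch of one kind of data transformation: reshaping a small table from wide format to long format. The dataset, the column names (country, year, population), and the helper wide_to_long are all hypothetical and chosen only for illustration.

```python
# Hypothetical wide-format data: one row per country, one column per year.
wide_rows = [
    {"country": "A", "2019": 10, "2020": 12},
    {"country": "B", "2019": 7, "2020": 9},
]


def wide_to_long(rows, id_column, variable_name, value_name):
    """Reshape wide rows into long rows: one output row per (id, column) pair."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue  # the id column is repeated on every output row instead
            long_rows.append({
                id_column: row[id_column],
                variable_name: column,  # which wide column the value came from
                value_name: value,      # the value itself
            })
    return long_rows


long_rows = wide_to_long(wide_rows, "country", "year", "population")
for long_row in long_rows:
    print(long_row)
```

The same information is present before and after the call; only the structure changes, from two rows with one column per year to four rows with a single year column and a single population column.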

*Weekly challenge 2*

1.

Question 1

A clinic surveys a group of male and female patients about their experience with physical therapy. The survey does not include people with disabilities. Is the survey data biased?

  • Yes
  • No

2.

Question 2

A researcher believes that playing music to plants will increase flower production. They set up an experiment and collect the data. Although the findings are inconclusive, they choose to interpret the data in a way that supports their desired result. What type of bias does this represent?

  • Interpretation
  • Sampling
  • Observer
  • Confirmation

3.

Question 3

A data analyst reviews a dataset. They conclude that the data is inaccurate and incomplete in some places. They also confirm that the data is biased. What type of data does this describe?

  • Open data
  • Unreliable data
  • Good data
  • Informed data

4.

Question 4

In data ethics, what gives an individual the right to know why their data is collected and how it will be used?

  • Privacy
  • Credibility
  • Consent
  • Anonymization

5.

Question 5

Fill in the blank: In data ethics, the individual who originally generates the data is the person who _____ the data.

  • transforms
  • processes
  • deletes
  • owns

6.

Question 6

An employer accesses an employee’s credit report without their consent. This is not a violation of the employee’s privacy because they work at the company.

  • True
  • False

7.

Question 7

Fill in the blank: A company wants to protect its users’ private and sensitive data. One way to do this is to use _____ to remove any identifying information.

  • a data ethics review
  • a firewall
  • a data anonymizer
  • a screen
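
As a rough illustration of removing identifying information from records, the sketch below drops the identifying fields and keeps only a salted hash as a record key. The field names, the salt, and the anonymize helper are assumptions made for this example. A salted hash is closer to pseudonymization than to full anonymization, so treat this as a sketch of the idea rather than a complete privacy solution.

```python
import hashlib

# Hypothetical user records; every field name here is an assumption.
users = [
    {"name": "Ana Lopez", "email": "ana@example.com", "plan": "basic"},
    {"name": "Sam Chen", "email": "sam@example.com", "plan": "premium"},
]

# Fields treated as identifying in this example.
IDENTIFYING_FIELDS = {"name", "email"}


def anonymize(record, salt):
    """Return a copy of the record without identifying fields.

    A truncated salted SHA-256 digest of the email is kept as an opaque key
    so that rows can still be told apart.
    """
    digest = hashlib.sha256((salt + record["email"]).encode("utf-8")).hexdigest()
    cleaned = {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}
    cleaned["user_key"] = digest[:12]
    return cleaned


anonymized = [anonymize(user, salt="example-salt") for user in users]
for row in anonymized:
    print(row)
```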

8.

Question 8

A key aspect of open data is free access to people’s personal information.

  • True
  • False

*Weekly challenge 3*

1.

Question 1

Which of the following properties describe primary keys in a relational database? Select all that apply.

  • They are used to ensure data in a specific column is unique.
  • They refer to another primary key in a different table.
  • There can be multiple primary keys in a table.
  • There can only be one primary key in a table.
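
The behavior of primary keys can be explored with a small experiment. The sketch below uses Python's built-in sqlite3 module and an in-memory database; the table and column names are hypothetical, and other database systems may report these errors differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A hypothetical table with customer_id declared as its primary key.
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, first_name TEXT)"
)
conn.execute("INSERT INTO customer VALUES (1, 'Mark')")

# Try to insert a second row that reuses the same primary key value.
duplicate_rejected = False
try:
    conn.execute("INSERT INTO customer VALUES (1, 'Maria')")
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Try to declare two separate primary keys on one table.
second_key_rejected = False
try:
    conn.execute(
        "CREATE TABLE bad (a INTEGER PRIMARY KEY, b INTEGER PRIMARY KEY)"
    )
except sqlite3.OperationalError:
    second_key_rejected = True

print(duplicate_rejected, second_key_rejected)
```

In SQLite, both attempts are refused: the duplicate key value violates the uniqueness constraint, and the second table definition is rejected when it is parsed.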

2.

Question 2

What is a tool used to store metadata so analysts can ensure that data is consistent and reliable?

  • Metadata repository
  • Metadata server
  • Text files
  • Relational database

3.

Question 3

Structural metadata indicates how a piece of data is organized and whether it’s part of one or more than one data collection.

  • True
  • False

4.

Question 4

Fill in the blank: Data _____ is the process of ensuring the formal management of a company’s data assets.

  • governance
  • integrity
  • aggregation
  • mapping

5.

Question 5

In what circumstance might a data analyst choose not to use external data in their analysis?

  • The data is too thorough.
  • The data cannot be confirmed to be reliable.
  • The data is free for anyone to access.
  • The data represents diverse perspectives.

6.

Question 6

A data analyst reviews a database of Wisconsin car sales to find the last five car models sold in Milwaukee in 2019. How can they sort and filter the data to return the last five cars sold at the top of their list? Select all that apply.

  • Sort by sale date in ascending order
  • Filter out sales outside of Milwaukee
  • Filter out sales not in 2019
  • Sort by sale date in descending order
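
One way to experiment with this scenario is to run it against a tiny in-memory table. The sketch below uses Python's sqlite3 module; the car_sales table, its columns, and the sample rows are all invented for illustration, and dates are stored as ISO-formatted text so that they sort chronologically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE car_sales (model TEXT, city TEXT, sale_date TEXT)")
conn.executemany(
    "INSERT INTO car_sales VALUES (?, ?, ?)",
    [
        ("Model A", "Milwaukee", "2019-01-15"),
        ("Model B", "Madison", "2019-11-30"),
        ("Model C", "Milwaukee", "2018-12-31"),
        ("Model D", "Milwaukee", "2019-06-02"),
        ("Model E", "Milwaukee", "2019-12-24"),
        ("Model F", "Milwaukee", "2019-03-09"),
        ("Model G", "Milwaukee", "2019-09-18"),
        ("Model H", "Milwaukee", "2019-10-05"),
    ],
)

# Keep only Milwaukee sales from 2019, newest first, and take the first five.
rows = conn.execute(
    """
    SELECT model, sale_date
    FROM car_sales
    WHERE city = 'Milwaukee'
      AND sale_date BETWEEN '2019-01-01' AND '2019-12-31'
    ORDER BY sale_date DESC
    LIMIT 5
    """
).fetchall()

for model, sale_date in rows:
    print(model, sale_date)
```

Changing DESC to ASC in the query shows how the sort direction affects which sales appear at the top of the list.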

7.

Question 7

When writing a query, it’s necessary for the name of the dataset to be inside two backticks in order for the query to run properly.

  • True
  • False

8.

Question 8

You are working with a database table that contains customer data. The first_name column lists the first name of each customer. You are only interested in customers with the first name Mark.

You write the SQL query below.

SELECT * FROM Customer

What code would be added to return only customers named Mark?

  • IN first_name = 'Mark'
  • JOIN first_name = 'Mark'
  • first_name = 'Mark'
  • WHERE first_name = 'Mark'
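
To try row filtering on a table like the one described, the sketch below builds a small Customer table in an in-memory SQLite database and runs a filtered query through Python's sqlite3 module. The customer_id column and the sample rows are assumptions; only the table name and the first_name column come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: only first_name is described in the question.
conn.execute("CREATE TABLE Customer (customer_id INTEGER, first_name TEXT)")
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?)",
    [(1, "Mark"), (2, "Maria"), (3, "Mark")],
)

# A filter condition restricts which rows the SELECT returns.
marks = conn.execute(
    "SELECT * FROM Customer WHERE first_name = 'Mark'"
).fetchall()

for row in marks:
    print(row)
```

Note that the string literal uses straight single quotes; curly quotes pasted from a word processor are not valid SQL string delimiters.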

*Weekly challenge 4*

1.

Question 1

A data analytics team labels its files to indicate their content, creation date, and version number. The team is using what data organization tool?

  • File-naming attributes
  • File-naming verifications
  • File-naming references
  • File-naming conventions

2.

Question 2

Your boss assigns you a new multi-phase project and you create a naming convention for all of your files. Because the project will last years and involve multiple analysts, it’s crucial that you create data explaining how your naming conventions are structured. What is this data called?

  • Labeled data
  • Descriptive data
  • Named convention
  • Metadata

3.

Question 3

Which of the following are examples of effective file names? Select all that apply.

  • NewCustomerSurvey-2020-6-20-V03
  • NewCustomerSurvey_2020-6-20_V03
  • NewCustomerSurvey_2020_6_20_V03
  • NewCustomerSurvey 2020-6-20 V03

4.

Question 4

Data analysts use a process called encryption to organize folders into subfolders.

  • True
  • False

5.

Question 5

Fill in the blank: A data analyst has completed their project and has no use for the remaining files. They decide to _____ them on the company’s central server.

  • archive
  • duplicate
  • delete
  • push

6.

Question 6

A data analyst working for a pet supply company has just started a new project. They organize their project folders to be broad topics at the top with more specific topics in the subfolders. This folder structure is known as what?

  • Bottom to top approach
  • Hierarchical approach
  • Broad approach
  • Top to bottom approach

7.

Question 7

Using encryption to protect data is an example of what?

  • Data integrity
  • Data ethics
  • Data validation
  • Data security

8.

Question 8

To reduce clutter, a data analyst hides cells that contain long, complex formulas. The hidden cells allow the data analyst to protect their formulas and hide the data from other users with access to the spreadsheet.

  • True
  • False

*Course challenge*

1.

Question 1

Scenario 1, questions 1-5

You’ve been working at a data analytics consulting company for the past six months. Your team helps restaurants use their data to better understand customer preferences and identify opportunities to become more profitable.

To do this, your team analyzes customer feedback to improve restaurant performance. You use data to help restaurants make better staffing decisions and drive customer loyalty. Your analysis can even track the number of times a customer requests a new dish or ingredient in order to revise restaurant menus.

Currently, you’re working with a vegetarian sandwich restaurant called Garden. The owner wants to make food deliveries more efficient and profitable. To accomplish this goal, your team will use delivery data to better understand when orders leave Garden, when they get to the customer, and overall customer satisfaction with the orders.

Before project kickoff, you attend a discovery session with the vice president of customer experience at Garden. He shares information to help your team better understand the business and project objectives. As a follow-up, he sends you an email with datasets.

Click below to read the email:

C3 Scenario 1_Client Email .pdf

PDF File

And click below to access the datasets:

Course 3 Final Challenge Data Sets – Customer survey data (1)

CSV File

Course 3 Final Challenge Data Sets – Delivery times_distance (1)

CSV File

Reviewing the data enables you to describe how you will use it to achieve your client’s goals. First, you notice that all of the data was collected by Garden employees using their own resources. What type of data does this describe?

  • Nominal data
  • Third-party data
  • First-party data
  • Qualitative data

2.

Question 2

Scenario 1 continued

Next, you review the customer satisfaction survey data. To use the template for the customer satisfaction survey data, click the link below and select “Use Template.” 

Link to template: Customer Satisfaction Survey data

OR

If you don’t have a Google account, download the CSV file directly from the attachment below.

CustomerSurveyData – Customer survey data

CSV File

The question in column E asks, “Was your order accurate? Please respond yes or no.” What kind of data is this?

  • Second-party data
  • Ordinal data
  • Boolean data
  • Clean data

3.

Question 3

Scenario 1 continued

Now, you review the data on delivery times and the distance of customers from the restaurant.

To use the template for the dataset, click the link below and select “Use Template.” 

Link to template: Delivery Times/Distance

OR

If you don’t have a Google account, download the CSV file directly from the attachment below.

DeliveryTimes_DistanceData – Delivery times_distance

CSV File

Fill in the blank: The data in column E is an example of _____ data. Select all that apply.

  • quantitative
  • continuous
  • qualitative
  • discrete

4.

Question 4

Scenario 1 continued

The next thing you review is the file containing pictures of sandwich deliveries over a period of 30 days. This is an example of structured data.

  • True
  • False

5.

Question 5

Scenario 1 continued

Now that you’re familiar with the data, you want to build trust with the team at Garden.

What data-security measures do you employ? Select all that apply.

  • Add passwords to files
  • Assign user permissions for files
  • Make personal copies of client files
  • Change their file naming conventions

6.

Question 6

Scenario 2, questions 6-10

You’ve completed this program and are interviewing for a junior data scientist position at a company called Sewati Financial Services.

Click below to review the job description:

C3 Course Challenge Junior Data Scientist Job Description .pdf

PDF File

So far, you’ve successfully completed the first interview with a recruiter. They arrange your second interview with the team at Sewati Financial Services.

Click below to read the email from the human resources director:

Course 3 Scenario 2_Second Interview Email.pdf

PDF File

You arrive 15 minutes early for your interview. Soon, you are escorted into a conference room, where you meet Kai Harvey, the senior manager of strategy. After welcoming you, he begins the behavioral interview.

Consider and respond to the following question. Select all that apply.

Our data analytics team often surveys clients to get their feedback. If you were on the team, how would you ensure the results do not favor a particular person, group of people, or thing?

  • Ensure the survey sample represents the population as a whole.
  • Instruct participants to share their name and contact information.
  • Make sure the wording of the survey question does not encourage a specific response from participants.
  • Give participants enough time to answer each survey question.

7.

Question 7

Scenario 2 continued

Consider and respond to the following question. Select all that apply.

Our data analytics team often uses both internal and external data. Describe the difference between the two.

  • External data comes from a company’s own systems. Internal data comes from the organization.
  • External data is often generated from within the company. Internal data is generated outside the organization.
  • Internal data is often generated from within the company. External data is generated outside the organization.
  • Internal data comes from a company’s own systems. External data comes from outside the organization.

8.

Question 8

Scenario 2 continued

Consider and respond to the following question. Select all that apply.

Our analysts often work within the same spreadsheet, but for different purposes. What tools would you use in such a situation?

  • Freeze the header rows
  • Filter to show only the data that meets specific criteria
  • Sort the data to make it easier to understand, analyze, and visualize
  • Encrypt the spreadsheet so only you can access it

9.

Question 9

Scenario 2 continued

Next, your interviewer wants to better understand your knowledge of basic SQL commands. He asks: How would you write a query that retrieves only data about people who joined our firm in 2019 from the Clients table in our database?

Answer:

10.

Question 10

Scenario 2 continued

For your final question, your interviewer explains that Sewati Financial Services needs its clients’ trust, and this is an important responsibility for the data analytics team.

He asks you to identify which data analytics practice involves preserving a data subject’s information and activity any time a data transaction occurs.

  • Encryption
  • Sharing permissions
  • Bias
  • Data privacy
