To Retrieve Only Those Records That Include People Who Have Served On The Board Of Trustees Or On The Board Of Directors, What Is The Correct Query? (2023)

1. To retrieve only those records that include people who have served on ...

  • To retrieve only those records that include people who have served on the board of trustees or on the board of directors, the syntax must include “OR.” ...

  • 7. Scenario 1, continued: You finish cleaning your datasets, so you decide to review Tayen’s email one more time to make sure you completed the task fully. It’s a good thing you checked because you forgot to identify people who have served on the board of directors or board of trustees.…
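
The snippets above say the query needs OR but stop before showing one. A minimal sketch of such a query follows, run through Python's built-in sqlite3 module so it can be tried as-is. The table name `people`, the column `board_role`, and the sample rows are all hypothetical, invented for illustration; the course's actual table and column names are not given in these results.

```python
import sqlite3

# Hypothetical schema and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, board_role TEXT)")
conn.executemany(
    "INSERT INTO people (name, board_role) VALUES (?, ?)",
    [
        ("Person A", "trustees"),
        ("Person B", "directors"),
        ("Person C", None),         # never served on a board
        ("Person D", "volunteer"),  # some other role
    ],
)

# OR keeps a row when at least one of the two conditions is true.
# Using AND here would return nothing, because a single value
# cannot equal both 'trustees' and 'directors' at once.
query = """
    SELECT name
    FROM people
    WHERE board_role = 'trustees'
       OR board_role = 'directors'
    ORDER BY name
"""
board_members = [row[0] for row in conn.execute(query)]
print(board_members)
```

Rows with a NULL or unrelated role fail both comparisons and are filtered out, so only the two board members remain.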

2. To retrieve only those records that include people who have ... - Quizerry

  • To retrieve only those records that include people who have served on the board of trustees or on the board of directors, the correct query is: Related ...

  • 7. Scenario 1, continued: You finish cleaning your datasets, so you decide to review Tayen’s email one more time to make sure you completed the task fully. It’s a good thing you checked because you forgot to identify people who have served on the board of directors or board of trustees.…

3. Week 4 - Course challenge - Shuffle Q/A 1 - Thewodm

  • Jul 8, 2023 · ... need to locate them in the database. To retrieve only those records that include people who have served on the board of trustees or on the ...

  • Analyze data to answer questions course challenge answers: Shuffle Q/A 1

4. Week 4 - Course challenge - Thewodm

  • To retrieve only those records that include people who have served on the board of trustees or on the board of directors, you use the WHERE clause. The ...

  • Analyze data to answer questions course challenge answers.

5. Google Data Analytics Professional Certificate Answers - Coursera

  • To retrieve only those records that include people who have served on the board of trustees or on the board of directors, what clause do you include in your ...

  • Google Data Analytics Professional Certificate Answers - Coursera. Whether you’re just getting started or want to take the next step in the high-growth field of data analytics, professional certificates from Google can help you gain in-demand skills. You’ll learn about R programming, SQL, Python, Tableau and more. Data analysts prepare, process, and analyze data to help inform business decisions. They create visualizations to share their findings with stakeholders and provide recommendations driven by data. This certification is part of Google Career Certificates. Complete a Google Career Certificate to get exclusive access to CareerCircle, which offers free 1-on-1 coaching, interview and career support, and a job board to connect directly with employers, including over 150 companies in the Google Career Certificates Employer Consortium. Language: English. Certification URLs: grow.google/certificates/data-analytics, coursera.org/google-certificates/data-analytics-certificate. Questions: Course 1 – Foundations: Data, Data, Everywhere. Week 1 – Introducing data analytics. Fill in the blank: A collection of elements that interact with one another to produce, manage, store, organize, analyze, and share data is known as a data ______. Options: environment | ecosystem | model | cloud. Fill in the blank: In data science, ________ is when a data analyst uses their unique past experiences to understand the story the data is telling. Options: rational thought | gut instinct | personal opinion | awareness. Fill in the blank: When posting in a discussion forum, you should make sure that any articles discussed are _______ to data analytics. Options: unique | well known | relevant | popular. Data analysis is the various elements that interact with one another in order to provide, manage, store, organize, analyze, and share data. Options: True | False. In data analytics, a model is a group of elements that interact with one another. Options: True | False. Fill in the blank: The primary goal of a data _____ is to create new questions using
data.designeranalystengineerscientist Fill in the blank: The term _____ is defined as an intuitive understanding of something with little or no explanation.personal opinionrational thoughtgut instinctawareness A company defines a problem it wants to solve. Then, a data analyst gathers relevant data, analyzes it, and uses it to draw conclusions. The analyst shares their analysis with subject-matter experts, who validate the findings. Finally, a plan is put into action. What does this scenario describe?Data scienceData-driven decision-makingCustomer serviceIdentification of trends What do subject-matter experts do to support data-driven decision-making? Select all that apply.Offer insights into the business problemReview the results of data analysis and identify any inconsistenciesCollect, transform, and organize dataValidate the choices made as a result of the data insights You have just finished analyzing data for a marketing project. Before moving forward, you share your results with members of the marketing team to see if they might have additional insights into the business problem. What practice does this support?Data analyticsData scienceData-driven decision-makingData management You read an interesting article about data analytics in a magazine and want to share some ideas from the article in the discussion forum. In your post, you include the author and a link to the original article. 
This would be an inappropriate use of the forum.TrueFalse Which of the following options describes data analysis?The various elements that interact with one another in order to provide, manage, store, organize, analyze, and share dataCreating new ways of modeling and understanding the unknown by using raw dataThe collection, transformation, and organization of data in order to draw conclusions, make predictions, and drive informed decision-makingUsing facts to guide business strategy In data analytics, what term describes a collection of elements that interact with one another to produce, manage, store, organize, analyze, and share data?The cloud environmentA modeling systemA data ecosystemA database Select the best description of gut instinct.Choosing facts that complement your personal experiencesAn intuitive understanding of something with little or no explanationManipulating data to match your intuitionUsing your innate ability to analyze results A furniture manufacturer wants to find a more environmentally friendly way to make its products. A data analyst helps solve this problem by gathering relevant data, analyzing it, and using it to draw conclusions. The analyst then shares their analysis with subject-matter experts from the manufacturing team, who validate the findings. Finally, a plan is put into action. This scenario describes data-driven decision making.TrueFalse Fill in the blank: _______ are an important part of data-driven decision-making because they are people familiar with the business problem and can offer insight into the results of data analysis.CustomersCompetitorsSubject-matter expertsStakeholders Consulting with experts in the marketing department about your marketing analysis is an example of what process?Data analyticsData-driven decision-makingData managementData science You have recently subscribed to an online data analytics magazine. You really enjoyed an article and want to share it in the discussion forum. 
Which of the following would be appropriate in a post? Select all that apply.Checking your post for typos or grammatical errors.Including an advertisement for how to subscribe to the data analytics magazine.Giving credit to the original author.Including your own thoughts about the article. Which of the following could be elements of a data ecosystem? Select all that applySharing dataProducing dataGaining insightsManaging data If you are using data-driven decision-making, what action steps would you take? Select all that apply.Surveying customers about results, conclusions, and recommendationsGathering and analyzing dataSharing your results with subject matter expertsDrawing conclusions from your analysis What do subject-matter experts do to support data-driven decision-making? Select all that apply.Collect, transform, and organize dataOffer insights into the business problemReview the results of data analysis and identify any inconsistenciesValidate the choices made as a result of the data insights Fill in the blank: When following data-driven decision-making, a data analyst will consult with ______ .subject matter expertsstakeholdersmanagerscustomers What is the purpose of data analysis? Select all that apply.To drive informed decision-makingTo create models of dataTo draw conclusionsTo make predictions A data analyst is someone who does what?Designs new productsCreates new questions using dataSolves engineering problemsFinds answers to existing questions by creating insights from data sources What tactics can a data analyst use to effectively blend gut instinct with facts? Select all that apply.Use their knowledge of how their company works to better understand a business need.Focus on intuition to choose which data to collect and how to analyze it.Ask how to define success for a project, but rely most heavily on their own personal perspective.Apply their unique past experiences to their current work, while keeping in mind the story the data is telling. 
To get the most out of data-driven decision-making, it’s important to include insights from people very familiar with the business problem. What are these people called? Options: Subject-matter experts | Customers | Stakeholders | Competitors. A music streaming service is looking to increase user engagement on their platform. The CEO decides to leverage the company's user data and tasks the data analysts with uncovering unknown trends and characteristics of the company's user base. This strategy is known as what? Options: Data analytics decision-making | Data science decision-making | Data management decision-making | Data-driven decision-making. You read an interesting article in a magazine and want to share it in the discussion forum. What should you do when posting? Select all that apply. Options: Check your post for typos or grammatical errors | Include your email address for people to send questions or comments | Make sure the article is relevant to data analytics | Take credit for creating the article. A data scientist is someone who does what? Options: Creates new questions using data | Finds answers to existing questions by creating insights from data sources | Solves engineering problems | Solves engineering problems. Data analysts act as detectives to uncover clues within the data. Like a detective, a data analyst may use their _______ to solve business problems. Options: personal opinion | rational thought | gut instinct | awareness. In data-driven decision-making, a data analyst would share their results with subject matter experts and draw conclusions from their analysis.
What else would a data analyst do in data-driven decision-making?Identification of trendsDetermining the stakeholders.Survey customers about results, conclusions, and recommendationsGather and analyze data Fill in the blank: _________ is the act of consulting with subject-matter experts about the results of your data analysis.Data analyticsData scienceData managementData-driven decision-making Data ______ is the collection, transformation, and organization of data in order to draw conclusions, make predictions, and drive informed decision-making.scienceanalysisecosystemlife cycle Fill in the blank: The primary goal of a data _____ is to find answers to existing questions by creating insights from data sources.engineerscientistanalystdesigner Sharing your results with subject matter experts and gathering and analyzing data are carried out in data driven-decision-making. What else is included in this process?Determining the stakeholdersIdentification of trendsDrawing conclusions from your analysis.Surveying customers about results, conclusions, and recommendations Fill in the blank: The people very familiar with a business problem are called _____. 
They are an important part of data-driven decision-making.subject-matter expertscustomerscompetitorsstakeholders Fill in the blank: When posting in a discussion forum, you should always check your post for _______ and grammatical errorssupporttyposimportancepopularity Fill in the blank: Hardware, software, and the cloud all interact with each other to store and organize data in a _____.cloud environmentmodeling systemdatabasedata ecosystem Gut instinct is an intuitive understanding of something with little or no explanation.TrueFalse Week 2 – All about analytical thinking Fill in the blank: Gathering additional information about data to understand the broader picture is an example of understanding _____.problemsdataknowledgecontext Correlation is the aspect of analytical thinking that involves figuring out the specific details that help you execute a plan.TrueFalse What method involves asking multiple questions in order to get to the root cause of a problem?The five whysStrategizingCuriosityInquiry A junior data analyst is seeking out new experiences in order to gain knowledge. They watch videos and read articles about data analytics. They ask experts questions. 
Which analytical skill are they using?Data strategyHaving a technical mindsetCuriosityUnderstanding context Identifying the motivation behind data collection and gathering additional information are examples of which analytical skill?Data designA technical mindsetUnderstanding contextData strategy Having a technical mindset is an analytical skill involving what?Managing people, processes, and toolsUnderstanding the condition in which something exists or happensBreaking things down into smaller steps or piecesBalancing roles and responsibilities Fill in the blank: Data strategy involves _____ the people, processes, and tools used in data analysis.supervisingmanagingchoosingvisualizing Correlation is the aspect of analytical thinking that involves figuring out the specifics that help you execute a plan.TrueFalse What method involves asking numerous questions in order to get to the root cause of a problem?StrategizingThe five whysCuriosityInquiry Gap analysis is a method for examining and evaluating how a process works currently in order to get where you want to be in the future.TrueFalse Data-driven decision-making involves the five analytical skills: curiosity, understanding context, having a technical mindset, data design, and data strategy. Each plays a role in data-driven decision-making.TrueFalse Shuffle Q/AFill in the blank: The analytical skill of ______ involves seeking out new experiences in order to gain knowledge.understanding contexthaving a technical mindsetdata strategycuriosity Breaking things down into smaller steps or pieces and working with them in an orderly and logical way describes which analytical skill?Data strategyContextCuriosityA technical mindset In data analysis, data strategy is the analytical skill that involves managing which of the following? Select all that apply.PeopleConsentToolsProcesses A grocery store owner notices that they sell more orange juice during the winter season, when people are more likely to get sick. 
After observing this for a couple of years, they decide to stock more orange juice during the winter. The store owner is using which quality of analytical thinking?detail-oriented thinkingcorrelationproblem-orientationvisualization The five whys is a technique that involves asking, “Why?” five times in order to achieve what goal?Identify the root cause of a problemVisualize how a process should look in the futurePut a plan into actionUse facts to guide business strategy In data analysis, one often examines and evaluates how a process currently works in order to get it to where they want it to be in the future. This is known as what?Building a data visualizationGap analysisDetermining the stakeholdersAsking the five whys A company is seeing a decline in organizational efficiency. They decide to hire an outside organization to help increase overall performance. The data analyst, working for the newly contracted company, utilizes five analytical skills: curiosity, understanding context, having a technical mindset, data design, and data strategy to deliver the project goals. Once the project goals are met, the analyst informs the decision makers of their findings and the project is completed. 
What strategy did the data analyst use to complete this project?Gut instinctGap analysisData-driven decision-makingThe five whys Identifying the motivation behind the collection of a dataset is an example of the analytical skill of understanding context.TrueFalse A technical mindset involves the ability to break things down into smaller steps or pieces and work with them in an orderly and logical way.TrueFalse Data design is how you organize information; data strategy is the management of the people, processes, and tools used in data analysis.TrueFalse What method involves examining and evaluating how a process works currently in order to get where you want to be in the future?The five whysStrategyGap analysisData visualization Seeking out new challenges and experiences in order to learn is an example of which analytical skill?CuriosityData strategyUnderstanding contextHaving a technical mindset Which of the following examples best describe the analytical skill of understanding context? Select all that apply.Adding descriptive headers to columns of data in a spreadsheetWorking with facts in an orderly mannerGathering additional information about data to understand the broader pictureIdentifying the motivation behind the collection of a dataset Fill in the blank: In data analysis, data strategy involves managing the people, processes, and _____ .projectsproceduresconsenttools Identifying a relationship between two or more pieces of data is known as what?visualizationcorrelationproblem-orientationdetail-oriented thinking As a new data analyst, your boss asks you to perform a gap analysis on one of their current processes. 
What does this entail?Building a data visualizationAsking the five whysExamining and evaluating how a process works currently in order to get where you want to be in the futureDetermining the stakeholders Fill in the blank: In data-driven decision making, data analysts use five analytical skills of curiosity, understanding context, having a technical mindset, data design, and _______ .data strategyforward-lookingintuitionefficiency The analytical skill of understanding context entails which of the following?Breaking things down into smaller steps or piecesManaging people, processes, and toolsBalancing roles and responsibilitiesUnderstanding the condition in which something exists or happens Fill in the blank: _____ involves the ability to break things down into smaller steps or pieces and work with them in an orderly and logical way.Data strategyCuriosityContextA technical mindset Which analytical skill involves managing the people, processes, and tools used in data analysis?Understanding contextData designData strategyCuriosity The manager at a music shop notices that more trombones are repaired on the days when Alex and Jasmine work the same shift. After some investigation, the manager discovers that Alex is excellent at fixing slides, and Jasmine is great at shaping mouthpieces. Working together, Alex and Jasmine repair trombones faster. The manager is happy to have discovered this relationship and decides to always schedule Alex and Jasmine for the same shifts. In this scenario, the manager used which quality of analytical thinking?VisualizationProblem-orientationCorrelationBig-picture thinking Fill in the blank: In order to get to the root cause of a problem, a data analyst should ask “Why?” ________ times.fivethreesevenfour A company is receiving negative comments on social media about their products. 
To solve this problem, a data analyst uses each of their five analytical skills: curiosity, understanding context, having a technical mindset, data design, and data strategy. This makes it possible for the analyst to use facts to guide business strategy and figure out how to improve customer satisfaction. What is this an example of?Data scienceGap analysisData-driven decision-makingData visualization Data analysts following data-driven decision-making use the analytical skills of curiosity, having a technical mindset, and data design. What other two analytical skills would they employ? Select all that apply.knowledgedata strategyefficiencyunderstanding context Curiosity is the analytical skill of using your instinct to solve problems.TrueFalse Adding descriptive headers to columns of data in a spreadsheet is an example of which analytical skill?Having a technical mindsetUnderstanding contextData strategyCuriosity A company has recently tasked their data science team with figuring out what is causing the decline in production at one of their plants. The data analysts ask a number of questions trying to get to the root cause of the problem. This technique is known as what?InquiryThe five whysCuriosityStrategizing Week 3 – The wonderful world of data A business analyst recently completed a project that their company has decided to use to solve a larger business problem. 
What step is this in the data analysis process? Options: Process | Analyze | Act | Share. A set of instructions used to perform a specified calculation is known as what? Options: A particular value | A function | A predefined statement | A formula. Which of the following is an example of why a data analyst may generate a query? Options: Visualizing data | Requesting data | Collecting data | Recording data. Fill in the blank: A business decides what kind of data it needs, how the data will be managed, and who will be responsible for it during the _____ stage of the data life cycle. Options: analyze | manage | plan | capture. The destroy stage of the data life cycle might involve which of the following actions? Select all that apply. Options: Storing data for future use | Shredding paper files | Uploading data to the cloud | Using data-erasure software. During the capture stage of the data life cycle, a data analyst may use spreadsheets to aggregate data. Options: True | False. Describe how the data life cycle differs from data analysis. Options: The data life cycle deals with making informed decisions; data analysis is using tools to transform data. | The data life cycle deals with transforming and verifying data; data analysis is using the insights gained from the data. | The data life cycle deals with identifying the best data to solve a problem; data analysis is about asking effective questions. | The data life cycle deals with the stages that data goes through during its useful life; data analysis is the process of analyzing data. What actions might a data analytics team take in the act phase of the data analysis process?
Select all that apply.Sharing analysis results using data visualizationsPutting a plan into action to help solve the business problemValidating insights provided by analystsFinalizing a strategy based on the analysis Fill in the blank: A formula is a set of instructions used to perform a specified calculation; whereas a function is _____.a predefined operationa question written by the usera particular valuea computer programming language Fill in the blank: To request, retrieve, and update information in a database, data analysts use a ____.calculationdashboardqueryformula Structured query language (SQL) enables data analysts to communicate with a database.TrueFalse Shuffle Q/AYou are in the plan stage of the data lifecycle for your current project. What action might you take during this stage?Decide what kind of data is needed.Use a formula to perform calculations.Validate insights provided by analysts.Shred paper files. A data analyst is working at a small tech startup. They’ve just completed an analysis project, which involved private company information about a new product launch. In order to keep the information safe, the analyst uses secure data-erasure software for the digital files and a shredder for the paper files. Which state of the data life cycle does this describe?ManageArchiveDestroyPlan A data analyst is working at a small tech startup. On their current project they are in the analyze stage of the data life cycle. 
What might they do in this stage?Choose the format of their spreadsheet headingsDetermine who is responsible for managing the dataValidate the insights provided by analystsUse a formula to perform calculations Fill in the blank: Data analysis has six process steps whereas the data life cycle has six _____.data analytics toolsstepsstageskey questions What is the main difference between a formula and a function?A formula can be used multiple times in a spreadsheet; a function can only be used once.A formula begins with an equal sign (=); a function begins with an asterisk (*).A formula is a set of instructions used to perform a specified calculation; a function is a preset command that automatically performs a specified process.A formula is used to add or subtract; a function is used to multiply or divide. What does a data analyst use to request information within a database?CalculationDashboardFormulaQuery Why is SQL the most popular structured query language? Select all that apply.SQL allows data analysts to use spreadsheetsSQL is the most secure database on the marketSQL is easy to understandSQL works with a wide variety of databases A data analyst uses spreadsheets to aggregate data during the capture phase of the data life cycle.TrueFalse Fill in the blank: The data life cycle has six _____ .data analytics toolsprocess stepskey questionsstages Fill in the blank: A query is used to _____ information from a database. Select all that apply.requestretrievevisualizeupdate Structured query language (SQL) allows a data analyst to retrieve and request data from a database. What else is SQL used for?Visualizing data within a databaseThe revising phase of the data life cycleUpdating databasesThe sharing phase of the data life cycle Fill in the blank: A business is determining who should be responsible for the data in their current data analysis project. 
This means that the company is in the ______ stage of the data life cycle.manageplananalyzecapture Fill in the blank: Shredding paper files and using data-erasure software would be actions taken by a data analyst in the _________ stage of the data lifecycle.ManagePlanArchiveDestroy Fill in the blank: Data analysis has six parts that are divided into distinct _____.process stepskey questionsdata analytics toolsstages Fill in the blank: In the _____ phase of the data analysis process, a data analytics team might validate the insights provided by analysts.processshareanalyzeact In data analysis, a predefined operation is known as what?A functionA formulaA particular valueA predefined statement In the course of their current project, a data analyst uses a query to retrieve and request information. Which of the following is a third option they can use a query for?Visualizing dataUpdating dataDeleting dataCollecting data In which stage of the data life cycle does a business decide what kind of data it needs, how the data will be managed, and who will be responsible for it?ManagePlanCaptureAnalyze A company takes the insights provided by its data analytics team, validates them, and finalizes a strategy. They then implement a plan to solve the original business problem. This describes the share step of the data analysis process.TrueFalse In the course of their current project, a data analyst uses a query to retrieve and request information. Which of the following are options the analyst can use a query for? Select all that apply.Updating dataCollecting dataVisualizing dataDeleting data In the plan stage of the data life cycle, what decisions would a data analyst make? Select all that apply.Who will be responsible for the dataHow the data will be managedWhat kind of data is neededHow the data will be analyzed In the analyze phase of the data life cycle, what might a data analyst do? 
Select all that apply. Options: Use spreadsheets to aggregate data | Use a formula to perform calculations | Create a report from their data | Choose the format of their spreadsheet headings. Fill in the blank: In the act phase of the data analysis process, a company may need to _____ the insights of the data analysis team. Options: accomplish | revise | validate | calculate. In data analysis, a function is a predefined operation whereas a formula is a set of instructions used to carry out a specific calculation. Options: True | False. A data analyst has finished an analysis project that involved private company data. They erase the digital files in order to keep the information secure. This describes which stage of the data life cycle? Options: Plan | Destroy | Archive | Manage. Fill in the blank: Using a formula to perform calculations, creating a report from their data, and using spreadsheets to aggregate data would all be actions carried out in the ________ stage of the data lifecycle. Options: manage | plan | analyze | capture. Fill in the blank: In the _____ phase of the data analysis process, a data analytics team might validate the insights provided by analysts. Options: process | act | analyze | share. Fill in the blank: Structured query language (SQL) enables data analysts to _____ information from a database. Select all that apply. Options: retrieve | visualize | request | update. Week 4 – Set up your toolbox. You are working with a database table named employee that contains data about employees. You want to review all the columns in the table. You write the SQL query below. Add a FROM clause that will retrieve the data from the employee table. Query: SELECT * FROM employee. What employee has the job title of Sales Manager? Options: Margaret Park | Andrew Adams | Nancy Edwards | Michael Mitchell. A data analyst creates the following visualization to clearly demonstrate how much more populous Charlotte is than the next-largest North Carolina city, Raleigh.
What type of chart do they use?A line chartA column, or bar, chartA scatter chartA pie chart Fill in the blank: A data analyst has to demonstrate how the population in a city has increased over time. In particular, they want to be able to see when the population has exceeded certain thresholds. The chart that would work best for this is a/an _____ chart.arealinecolumnbar In the following spreadsheet, the column labels in row 1 are called what?CriteriaAttributesDescriptorsCharacteristics Fill in the blank: In row 8 of the following spreadsheet, you can find the _____ of Cary.formatattributecriteriaobservation Fill in the blank: In the following spreadsheet, the _____ feature was used to alphabetize the city names in column B.Organize rangeName rangeRandomize rangeSort range A data analyst types =POPULATION(C2:C11) to find the average population of the cities in this spreadsheet. However, they realize they used the wrong formula. What syntax will correct this function?=AVERAGE(C2-C11)AVERAGE(C2:C11)AVERAGE(C2-C11)=AVERAGE(C2:C11) You are working with a database table named genre that contains data about music genres. You want to review all the columns in the table. You write the SQL query below. Add a FROM clause that will retrieve the data from the genre table.What is the name of the genre with ID number 3?JazzRockMetalBlues You are working with a database table that contains invoice data. The customer_id column lists the ID number for each customer. You are interested in invoice data for the customer with ID number 35.You write the SQL query below. Add a WHERE clause that will return only data about the customer with ID number 35.After you run your query, use the slider to view all the data presented.What is the billing country for the customer with ID number 35?IrelandArgentinaPortugalIndia A data analyst creates the following visualization to clearly demonstrate how much more populous Charlotte is than the next-largest North Carolina city, Raleigh. 
What type of chart is it?A scatter chartA column, or bar, chartA line chartA pie chart A data analyst wants to demonstrate a trend of how something has changed over time. What type of chart is best for this task?AreaColumnLineBar Shuffle Q/AFill in the blank: In row 1 of the following spreadsheet, the words rank and name are called _____?attributescharacteristicscriteriadescriptors In the following spreadsheet, where can you find all of the attributes—also known as the observation—of Fayetteville?Row 7Column BRow 6Cell B7 Fill in the blank: In the following spreadsheet, the feature sort range can be used to ________ the city names in column B?changealphabetizerandomizedelete The function =AVERAGE(C2:C11) can be used to do what for the following spreadsheet?Arrange the rows according to increasing population size.Find the city with the largest population.Arrange the rows according to decreasing population size.Find the average population of the cities You are working with a database table named employee that contains data about employees. You want to review all the columns in the table. You write the SQL query below. Add a FROM clause that will retrieve the data from the employee table.What employee has the job title of Sales Manager?Nancy EdwardsMargaret ParkMichael MitchellAndrew Adams You are working with a database table that contains invoice data. The customer_id column lists the ID number for each customer. You are interested in invoice data for the customer with ID number 40. You write the SQL query below. Add a WHERE clause that will return only data about the customer with ID number 40.After you run your query, use the slider to view all the data presented. What is the billing city for the customer with ID number 40?ParisDijonLondonBuenos Aires A data analyst has to create a visualization that makes it easy to show which of the top ten most populous cities in North Carolina have a population below 250,000 people. 
What type of chart would be best for this visualization?
  • Line chart
  • Pie chart
  • Bar chart
  • Scatter chart

A data analyst wants to demonstrate how the population in Charlotte has increased over time. They create this data visualization. This is an example of an area chart.
  • True
  • False

In row 1 of the following spreadsheet, the words rank, name, population, and county are called what?
  • Attributes
  • Descriptors
  • Criteria
  • Characteristics

In the following spreadsheet, what feature was used to alphabetize the city names in column B?
  • Organize range
  • Sort range
  • Name range
  • Randomize range

To find the average population of the cities in this spreadsheet, you type =AVERAGE. What is the proper way to type the range that will complete your function?
  • (C2,C11)
  • (C2-C11)
  • (C2:C11)
  • (C2*C11)

You are working with a database table named playlist that contains data about playlists for different types of digital media. You want to review all the columns in the table. You write the SQL query below. Add a FROM clause that will retrieve the data from the playlist table. What is the playlist with ID number 3?
  • Audiobooks
  • Music
  • Movies
  • TV Shows

You are working with a database table that contains invoice data. The customer_id column lists the ID number for each customer. You are interested in invoice data for the customer with ID number 28. You write the SQL query below. Add a WHERE clause that will return only data about the customer with ID number 28. After you run your query, use the slider to view all the data presented.
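The interactive query box from the quiz is not reproduced on this page. As a rough sketch of what the completed FROM and WHERE clauses could look like, the snippet below runs both kinds of query with Python's built-in sqlite3 module. Only the table name playlist and the column name customer_id come from the questions; the table name invoice, the other column names, and every row are hypothetical placeholders, so the output says nothing about the quiz's real data.

```python
import sqlite3

# In-memory database filled with hypothetical stand-in tables and rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE playlist (playlist_id INTEGER, name TEXT);
    INSERT INTO playlist VALUES (1, 'Example A'), (2, 'Example B'), (3, 'Example C');
    CREATE TABLE invoice (invoice_id INTEGER, customer_id INTEGER, billing_city TEXT);
    INSERT INTO invoice VALUES (1, 28, 'City X'), (2, 40, 'City Y'), (3, 28, 'City X');
""")

# SELECT * reviews all columns; the FROM clause names the table to read.
all_playlists = conn.execute("SELECT * FROM playlist").fetchall()

# The WHERE clause keeps only the rows whose customer_id matches.
customer_rows = conn.execute(
    "SELECT * FROM invoice WHERE customer_id = 28"
).fetchall()

print(all_playlists)
print(customer_rows)
```

The same pattern applies to the other variants of these questions: swap the table name in FROM, or the ID in the WHERE condition.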
What is the billing city for the customer with ID number 28?
  • Bangalore
  • Buenos Aires
  • Dijon
  • Salt Lake City

Which of the following best describes a bar chart?
  • It is a visualization that uses a circle which is divided into wedges sized based on numerical proportion.
  • It is a visualization that plots a sequence of points and connects them with straight lines or curves.
  • It is a visualization that represents data with columns, or bars, the heights of which are proportional to the values that they represent.
  • It is a visualization that plots individual points in the Cartesian coordinate plane.

A data analyst has to create a visualization that clearly shows when and for how long the population of Charlotte has been above one million people. They choose to use a line chart. Why is this the best choice for their visualization?
  • It is a visualization that plots a sequence of points and connects them with straight lines or curves.
  • It is a visualization that uses a circle which is divided into wedges sized based on numerical proportion.
  • It is a visualization that represents data with columns, or bars, the heights of which are proportional to the values that they represent.
  • It is a visualization that plots individual points in the Cartesian coordinate plane.

The words rank, name, population, and county in row 1 of the following spreadsheet are known as descriptors.
  • True
  • False

Fill in the blank: In the following spreadsheet, the ________ of High Point describes all of the data in row 10.
  • criteria
  • dataset
  • observation
  • format

If a data analyst wants to list the cities in this spreadsheet alphabetically, instead of numerically, what feature can they use in column B?
  • Sort range
  • Name range
  • Randomize range
  • Organize range

A data analyst wants to create a visualization that depicts the populations of the top ten most populous cities in North Carolina.
What type of chart would be best for this?
  • A pie chart
  • A scatter chart
  • A column, or bar, chart
  • A line chart

A data analyst has to demonstrate a trend of how something has changed over time. What type of chart is best for this task?
  • Line
  • Area
  • Bar
  • Column

You are working with a database table that contains invoice data. The customer_id column lists the ID number for each customer. You are interested in invoice data for the customer with ID number 54. You write the SQL query below. Add a WHERE clause that will return only data about the customer with ID number 54. After you run your query, use the slider to view all the data presented. What is the billing address for the customer with ID number 54?
  • 1033 N Park Ave
  • 230 Elgin St
  • 110 Raeburn Pl
  • 801 W 4th St

Fill in the blank: A data analyst creates a table, but they realize this isn’t the best visualization for their data. To fix the problem, they decide to use the ____ feature to change it to a column chart.
  • chart editor
  • rename
  • filter view
  • image

You are working with a database table named employee that contains data about employees. You want to review all the columns in the table. You write the SQL query below. Add a FROM clause that will retrieve the data from the employee table. What is the job title of Andrew Adams?
  • General Manager
  • Sales Manager
  • Sales Support Agent
  • IT Manager

Fill in the blank: Suppose you wanted to determine the average population of the cities in the following spreadsheet. The correct function syntax to use would be ________.
  • =AVERAGE(C2-C11)
  • AVERAGE(D2:D11)
  • AVERAGE(C2:C11)
  • =AVERAGE(C2:C11)

Week 5 – Endless career possibilities

A college IT department needs to reduce the number of computers on campus for student use.
How could a data analyst help identify a solution to this problem?
  • Analyze the number of classes scheduled across all classrooms
  • Analyze the utilization of the computer labs on campus
  • Analyze data on the number of students enrolled
  • Analyze the square footage of all computer labs on campus

In data analytics, what is the term for an obstacle to be solved?
  • Issue
  • Question
  • Problem
  • Solution

An online gardening magazine wants to understand why its subscriber numbers have been increasing. A data analyst discovers that significantly more people subscribe when the magazine has its annual 50%-off sale. This is an example of what?
  • Analyzing consumer preferences using artificial intelligence
  • Analyzing customer buying behaviors
  • Analyzing social media engagement
  • Analyzing the number of customers by calculating daily foot traffic

Fill in the blank: A doctor’s office has discovered that patients are waiting 20 minutes longer for their appointments than in past years. To help solve this problem, a data analyst could investigate how many nurses are on staff at a given time compared to the number of _____.
  • doctors seeing new patients
  • patients with appointments
  • negative comments about the wait times on social media
  • doctors on staff at the same time

A problem is an obstacle to be solved, an issue is a topic to investigate, and a question is designed to discover information.
  • True
  • False

What is a question or problem that a data analyst answers for a business?
  • Mission statement
  • Hypothesis
  • Complaint
  • Business task

Fill in the blank: Data-driven decision-making is described as using _____ to guide business strategy.
  • gut instinct
  • visualizations
  • facts
  • intuition

It’s possible for conclusions drawn from data analysis to be both true and unfair.
  • True
  • False

A data analyst is analyzing fruit and vegetable sales at a grocery store. They’re able to find data on everything except red onions.
What’s the best course of action?
  • Ask a teammate for help finding data on red onions.
  • Exclude red onions from the analysis.
  • Exclude all onion varieties from the analysis.
  • Use the data on white onions instead, as they’re both onion varieties.

Collaborating with a social scientist to provide insights into human bias and social contexts is an effective way to avoid bias in your data.
  • True
  • False

Shuffle Q/A

A restaurant hires a data analyst to determine the best times to have the restaurant open. Which of the following methods can the data analyst use to help build a better schedule for the restaurant? Select all that apply.
  • Analyze weekly weather data
  • Analyze staffing levels for different days
  • Examine hourly customer numbers
  • Survey customers on their preferred times to dine

A restaurant has noticed that customers often wait longer in line than in previous years. How could a data analyst help solve this problem?
  • Analyze the average sales amount per customer
  • Analyze customer survey results about the preferred opening hours of the restaurant
  • Analyze the number of staff on shift at any time
  • Analyze the products customers are purchasing

Fill in the blank: A business task is described as the _____ a data analyst answers for a business.
  • solution
  • complaint
  • question
  • comment

When you make decisions using observation and intuition as a guide, you only see part of the picture. What can improve your decision-making?
  • Using data
  • Using assumptions
  • Creating surveys
  • Being decisive

Data analysts ensure their analysis is fair for what reason?
  • Fairness helps them avoid biased conclusions.
  • Fairness helps them stay organized.
  • Fairness helps them communicate with stakeholders.
  • Fairness helps them pick and choose which data to include from a dataset.

A large hotel chain sees about 500 customers per week. A data analyst working there is gathering data through customer satisfaction surveys. They are anxious to begin analysis, so they start analyzing the data as soon as they receive 50 survey responses.
This is an example of what? Select all that apply.
  • Failing to include diverse perspectives in data collection
  • Failing to collect data anonymously
  • Failing to reward customers for participating in the survey
  • Failing to have a large enough sample size

An online gardening magazine wants to understand why its subscriber numbers have been increasing. What kind of reports can a data analyst provide to help answer that question? Select all that apply.
  • Reports that describe how many customers shared positive comments about the gardening magazine on social media in the past year
  • Reports that predict the success of sales leads to secure future subscribers
  • Reports that examine how a recent 50%-off sale affected the number of subscription purchases
  • Reports that compare past weather patterns to the number of people asking gardening questions to their social media

Fill in the blank: In data analytics, a question is _____.
  • an obstacle or complication that needs to be worked out
  • a way to discover information
  • a topic to investigate
  • a subject to analyze

What must a data analyst establish before they can start to plan the best approach to gather and analyze information?
  • The business task
  • The statement
  • The complaint
  • The solution

What is the process of using facts to guide business strategy?
  • Data-driven decision-making
  • Data ethics
  • Data visualization
  • Data programming

A data analyst is developing a model. They start by gathering data for groups that are underrepresented in a sample. What strategy could they employ to ensure these groups are represented fairly?
  • Oversample the underrepresented group
  • Sample the underrepresented group normally
  • Combine the underrepresented group with another group
  • Exclude the underrepresented group from the sample

A restaurant is trying to develop more effective staffing strategies. A data analyst recognizes that there are significantly fewer customers earlier in the business day. They conclude that opening later would be more effective for staffing.
What is this an example of?
  • Creating efficiencies by analyzing customer foot traffic
  • Tailoring products to consumer buying habits
  • Creating more effective customer communication
  • Gathering customer opinions about business changes

A restaurant has noticed many popular dishes are running out early in the day. How could a data analyst help identify a solution to this problem? Select all that apply.
  • Analyze ordering patterns of those products
  • Examine the number of sales of those products
  • Examine overall daily sales of the restaurant
  • Analyze the number of staff on shift during peak times

When working for a restaurant, a data analyst is asked to examine and report on the daily sales data from year to year to help with making more efficient staffing decisions. What is this an example of?
  • A business task
  • An issue
  • A solution
  • A breakthrough

Data-driven decision-making is using facts to guide business strategy. The benefits include which of the following? Select all that apply.
  • Getting a complete picture of a problem and its causes
  • Combining observation with objective data
  • Using data analytics to find the best possible solution to a problem
  • Making the most of intuition and gut instinct

A data analyst is analyzing fruit and vegetable sales at a grocery store. They’re able to find data on everything except red onions. If they exclude red onions from the analysis, this would be an example of creating or reinforcing bias.
  • True
  • False

A hotel is trying to gather data on their guests' satisfaction with their stay. Which of the following options would best help the hotel account for potential bias in their data?
  • Surveying guests at random times throughout the year
  • Only surveying guests who have booked their stay through a certain third-party website
  • Only surveying guests who have stayed at the hotel during peak season
  • Only surveying guests who have stayed at the hotel for more than 3 nights

A restaurant is struggling to accurately staff for the different daily customer volumes.
On some days, there are many servers and few customers. On other days, the restaurant is very busy and there are not enough servers and kitchen staff. What reports could a data analyst use to create more efficient staffing strategies? Select all that apply.
  • Reports of past and future reservations
  • Reports of past weather patterns in the area of the restaurant
  • Reports using historical sales data to predict sales for the current day/date
  • Reports of planned local events in the area of the restaurant

Fill in the blank: In data analytics, a topic to investigate is also known as a(n) _____.
  • theme
  • issue
  • question
  • statement

When a choice is made between good, bad, or a combination of consequences based on facts, it is also known as what?
  • Data-driven decision-making
  • Data ethics
  • Data visualization
  • Data programming

At what point in the data analysis process should a data analyst consider fairness?
  • When decisions are made based on the conclusions
  • When data collection begins
  • When data is being organized for reporting
  • When conclusions are presented

A restaurant is considering changing their operating hours. They survey customers that come in between 4 p.m. and 5 p.m. to get feedback on this potential change. What can the restaurant do to ensure the data analysis process is fair?
  • Expand the times when they survey customers
  • Survey only repeat customers
  • Reward customers for participating in the survey
  • Survey people walking by on the street

A doctor’s office discovers that patients are waiting 20 minutes longer for their appointments than in past years. In what ways could a data analyst help solve this problem?
Select all that apply.
  • Analyze the average length of an appointment this year compared to past years.
  • Analyze the number of patients seen per day compared to past years.
  • Analyze a recent change in the average rating for the doctor’s office on social media.
  • Analyze how many doctors and nurses are on staff at a given time compared to the number of patients with appointments.

Fill in the blank: Fairness is achieved when data analysis doesn’t create or _____ bias.
  • reinforce
  • constrain
  • highlight
  • resolve

A gym wants to start offering exercise classes. A data analyst plans to survey 10 people to determine which classes would be most popular. To ensure the data collected is fair, what steps should they take? Select all that apply.
  • Ensure participants represent a variety of profiles and backgrounds.
  • Collect data anonymously.
  • Survey only people who don’t currently go to the gym.
  • Increase the number of participants.

A doctor’s office has discovered that patients are waiting 20 minutes longer for their appointments than in past years. A data analyst could help solve this problem by analyzing how many doctors and nurses are on staff at a given time compared to the number of patients with appointments.
  • True
  • False

Fill in the blank: Once an analyst has identified a problem for a business, they establish a(n) _____ to help inform the process of gathering the correct information.
  • issue
  • business task
  • statement
  • solution

Which of the following best describes what fairness in data analytics means?
  • Ensuring that analysis does not create or reinforce bias
  • Including data from dominant groups
  • Collecting data objectively
  • Including self-reported data

Course Challenge

Scenario 1, questions 1-5

You’ve just started a new job as a data analyst for a midsized pharmacy chain with 38 stores in the American Southwest. Your supervisor shares a new data analysis project with you.

She explains that the pharmacy is considering discontinuing a bubble bath product called Splashtastic.
Your supervisor wants you to analyze sales data and determine what percentage of each store’s total daily sales come from that product. Then, you’ll present your findings to leadership.

You know that it's important to follow each step of the data analysis process: ask, prepare, process, analyze, share, and act. So, you begin by defining the problem and making sure you fully understand stakeholder expectations.

One of the questions you ask is where to find the dataset you’ll be working with. Your supervisor explains that the company database has all the information you need.

Next, you continue to the prepare step. You access the database and write a query to retrieve data about Splashtastic. You notice that there are only 38 rows of data, representing the company’s 38 stores. In addition, your dataset contains five columns: Store Number, Average Daily Customers, Average Daily Splashtastic Sales (Units), Average Daily Splashtastic Sales (Dollars), and Average Total Daily Sales (All Products). You decide to use a spreadsheet to work with the data because you know that spreadsheets work well for processing and analyzing a small dataset, like the one you’re using.

Fill in the blank: To get the data from the database into a spreadsheet, you would first _____ the data as a .CSV file, then import it into a spreadsheet.
  • email
  • download
  • copy and paste
  • print

Scenario 1 continued

You’ve downloaded the data from your company database and imported it into a spreadsheet.

IMPORTANT: To answer questions using this dataset for the scenario, click the link below and select the “Use Template” button before answering the questions.

Link to template: Course Challenge - Scenario 1
OR
If you don’t have a Google account, you can download the template directly from the attachment below.
Course Challenge Dataset - Scenario 1 - Scenario 1_ Pharmacy Data - Part 1 (CSV file)

Now, it’s time to process the data.
As you know, this step involves finding and eliminating errors and inaccuracies that can get in the way of your results. While cleaning the data, you notice that information about Splashtastic is missing for Store Number 15 in Row 16. Which of the following would be an appropriate course of action?
  • Delete the row with the missing data point.
  • Replace the row with the average values of the other data points.
  • Sort the spreadsheet so the row with missing data is at the bottom.
  • Investigate previous projects and see how this was dealt with there.

Scenario 1 continued

Once you’ve found the missing information, you analyze your dataset.

During analysis, you create a new column F. At the top of the column, you add: Average Percentage of Total Sales - Splashtastic. What is this column label called?
  • A title
  • A reference
  • An attribute
  • A headline

Scenario 1 continued

Next, you determine the average total daily sales over the past 12 months at all stores. The entire range of cells that contains these sales is E2:E39. The correct syntax is =AVERAGE(E2:E39).
  • True
  • False

Scenario 1 continued

Fill in the blank: You’ve reached the share phase of the data analysis process. One of the things that you can do in this phase is to prepare a _____ about Splashtastic’s sales and practice your presentation.
  • prediction
  • finding
  • record
  • slideshow

Scenario 2, questions 6-10

You’ve been working for the nonprofit National Dental Society (NDS) as a junior data analyst for about two months. The mission of the NDS is to help its members advance the oral health of their patients. NDS members include dentists, hygienists, and dental office support staff.

The NDS is passionate about patient health. Part of this involves automatically scheduling follow-up appointments after crown replacement, emergency dental surgery, and extraction procedures.
NDS believes the follow-up is an important step to ensure patient recovery and minimize infection.

Unfortunately, many patients don’t show up for these appointments, so the NDS wants to create a campaign to help its members learn how to encourage their patients to take follow-up appointments seriously. If successful, this will help the NDS achieve its mission of advancing the oral health of all patients.

Your supervisor has just sent you an email saying that you’re doing very well on the team, and he wants to give you some additional responsibility. He describes the issue of many missed follow-up appointments. You are tasked with analyzing data about this problem and presenting your findings using data visualizations.

An NDS member with three dental offices in Colorado offers to share its data on missed appointments. So, your supervisor uses a database query to access the dataset from the dental group. The query instructs the database to retrieve all patient information from the member’s three dental offices, located in zip code 81137.

The table is dental_data_table, and the column name is zip_code. You write the following query, but get an error. What statement will correct the problem?

    SELECT * FROM dental_data_table WHERE zip code = 81137

  • zip_code = 81137
  • WHERE_zip code = 81137
  • WHERE zip_code = 81137
  • WHERE 81137

Scenario 2 continued

The dataset your supervisor retrieved and imported into a spreadsheet includes a list of patients, their demographic information, dental procedure types, and whether they attended their follow-up appointment. To use the dataset for this scenario, click the link below and select “Use Template.”

Link to template: Course Challenge - Scenario 2
OR
If you don’t have a Google account, you can download the template directly from the attachment below.
Course Challenge Dataset - Scenario 2 (CSV file)

The patient demographic information includes data such as age, gender, and home address.
When examining the geographic data, you notice that all the patients live in the same zip code.

Fill in the blank: The fact that the dataset includes people who all live in the same zip code might get in the way of ______.
  • fairness
  • accuracy
  • spreadsheet formulas or functions
  • data visualization

Scenario 2 continued

As you’re reviewing the dataset, you notice that there are a disproportionate number of senior citizens. So, you investigate further and find out that this zip code represents a rural community in Colorado with about 800 residents. In addition, there’s a large assisted-living facility in the area. Nearly 300 of the residents in the 81137 zip code live in the facility.

You recognize that’s a sizable number, so you want to find out if age has an effect on a patient’s likelihood to attend a follow-up dental appointment. You analyze the data, and your analysis reveals that older people tend to miss follow-ups more than younger people.

So, you do some research online and discover that people over the age of 60 are 50% more likely to miss dentist appointments. Sometimes this is because they’re on a fixed income. Also, many senior citizens lack transportation to get to and from appointments.

With this new knowledge, you write an email to your supervisor expressing your concerns about the dataset. He agrees with your concerns, but he’s also impressed with what you’ve learned and thinks your findings could be very important to the project. He asks you to change the business task. Now, the NDS campaign will be about educating dental offices on the challenges faced by senior citizens and finding ways to help them access quality dental care.

Fill in the blank: Changing the business task involves defining a new _____.
  • gap analysis plan
  • graphical representation of the data
  • question or problem to be solved
  • data-cleaning strategy

Scenario 2 continued

You continue with your analysis.
In the end, your findings support what you discovered during your online research: As people get older, they’re less likely to attend follow-up dental visits.

But you’re not done yet. You know that data should be combined with human insights in order to lead to true data-driven decision-making. So, your next step is to share this information with people who are familiar with the problem professionally. They’ll help verify the results of your data analysis.

Fill in the blank: The people who are familiar with a problem and help verify the results of data analysis are _____.
  • customers
  • data scientists
  • stakeholders
  • subject-matter experts

Scenario 2 continued

The subject-matter experts are impressed by your analysis. The team agrees to move to the next step: data visualization. You know it’s important that stakeholders at NDS can quickly and easily understand that older people are less likely to attend important follow-up dental appointments than younger people. This will help them create an effective campaign for members.

It’s time to create your presentation to stakeholders. It will include a data visualization that demonstrates the lifetime trend of people being less likely to attend follow-up appointments as they get older.

Why would a line chart be the most effective in representing this?
  • Line charts are effective in displaying points in series.
  • Line charts arrange data values into rows.
  • Line charts represent data values as proportionally sized wedges.
  • Line charts arrange data values into columns.

Scenario 1, questions 1-5

You’ve just started a new job as a data analyst. You’re working for a midsized pharmacy chain with 38 stores in the American Southwest. Your supervisor shares a new data analysis project with you.

She explains that the pharmacy is considering discontinuing a bubble bath product called Splashtastic. Your supervisor wants you to analyze sales data and determine what percentage of each store’s total daily sales come from that product.
Then, you’ll present your findings to leadership.

You know that it's important to follow each step of the data analysis process: ask, prepare, process, analyze, share, and act. So, you begin by defining the problem and making sure you fully understand stakeholder expectations.

One of the questions you ask is where to find the dataset you’ll be working with. Your supervisor explains that the company database has all the information you need. Next, you continue to the prepare step. You access the database and write a query to retrieve data about Splashtastic. You notice that there are only 38 rows of data, representing the company’s 38 stores. In addition, your dataset contains five columns: Store Number, Average Daily Customers, Average Daily Splashtastic Sales (Units), Average Daily Splashtastic Sales (Dollars), and Average Total Daily Sales (All Products).

You know that spreadsheets work well for processing and analyzing a small dataset, like the one you’re using. To get the data from the database into a spreadsheet, what should you do?
  • Email a copy of the dataset to your company email address.
  • Use Tableau to convert the data into a spreadsheet.
  • Download the data as a .CSV file, then import it into a spreadsheet.
  • Copy and paste the data into a spreadsheet.

Scenario 1 continued

You’ve downloaded the data from your company database and imported it into a spreadsheet.

IMPORTANT: To answer questions using this dataset for the scenario, click the link below and select the “Use Template” button before answering the questions.

Link to template: Course Challenge - Scenario 1
OR
If you don’t have a Google account, you can download the template directly from the attachment below.
Course Challenge Dataset - Scenario 1 - Scenario 1_ Pharmacy Data - Part 1 (CSV file)

Now, it’s time to process the data. As you know, this step involves finding and eliminating errors and inaccuracies that can get in the way of your results.
While cleaning the data, you notice that information about Splashtastic is missing for Store Number 15 in Row 16. Which of the following would be an appropriate response?
  • Sort the spreadsheet so the row with missing data is at the bottom.
  • Ask a colleague on your team how they've handled similar issues in the past.
  • Delete the row with the missing data point.
  • Replace the row with the average values of the other data points.

Scenario 1 continued

Once you’ve found the missing information, you analyze your dataset. During analysis, you create a new column F. At the top of the column, you add the attribute Average Percentage of Total Sales - Splashtastic.

Fill in the blank: An attribute is a _______ or quality of data used to label a column.
  • number
  • headline
  • response
  • characteristic

Scenario 1 continued

Next, you determine the average total daily sales over the past 12 months at all stores. The entire range of cells that contains these sales is E2:E39. Identify the correct way to write your function.
  • =AVERAGE(E2+E39)
  • =AVERAGE(E2,E39)
  • =AVERAGE(E2:E39)
  • =AVERAGE(E2-E39)

Scenario 1 continued

You’ve reached the share phase of the data analysis process. It involves which of the following? Select all that apply.
  • Present your findings about Splashtastic to stakeholders.
  • Prepare a slideshow about Splashtastic’s sales and practice your presentation.
  • Create a data visualization to highlight the Splashtastic sales insights you've discovered.
  • Stop selling Splashtastic because it doesn't represent a large percentage of total sales.

Scenario 2, questions 6-10

You’ve been working for the nonprofit National Dental Society (NDS) as a junior data analyst for about two months. The mission of the NDS is to help its members advance the oral health of their patients. NDS members include dentists, hygienists, and dental office support staff.

The NDS is passionate about patient health.
Part of this involves automatically scheduling follow-up appointments after crown replacement, emergency dental surgery, and extraction procedures. NDS believes the follow-up is an important step to ensure patient recovery and minimize infection.

Unfortunately, many patients don’t show up for these appointments, so the NDS wants to create a campaign to help its members learn how to encourage their patients to take follow-up appointments seriously. If successful, this will help the NDS achieve its mission of advancing the oral health of all patients.

Your supervisor has just sent you an email saying that you’re doing very well on the team, and he wants to give you some additional responsibility. He describes the issue of many missed follow-up appointments. You are tasked with analyzing data about this problem and presenting your findings using data visualizations.

An NDS member with three dental offices in Colorado offers to share its data on missed appointments. So, your supervisor uses a database query to access the dataset from the dental group. The query instructs the database to retrieve all patient information from the member’s three dental offices, located in zip code 81137.

The table is dental_data_table, and the column name is zip_code. You have written the following query, but received an error when it ran.

    SELECT *
    FROM dental_data_table
    WHERE dental_data_table = 81137

Given the objective of the query, where is the mistake in this query?
  • SELECT, FROM, and WHERE should not be capitalized.
  • In line 2, dental_data_table should be replaced with zip_code 81137.
  • The third line should be WHERE = 81137
  • In line 3, dental_data_table should be replaced with zip_code.

Scenario 2 continued

The dataset your supervisor retrieved and imported into a spreadsheet includes a list of patients, their demographic information, dental procedure types, and whether they attended their follow-up appointment.
To use the dataset for this scenario, click the link below and select “Use Template.”

Link to template: Course Challenge - Scenario 2
OR
If you don’t have a Google account, you can download the template directly from the attachment below.
Course Challenge Dataset - Scenario 2 (CSV file)

The patient demographic information includes data such as age, gender, and home address. When examining the geographic data, you notice that all the patients live in the same zip code.

Fill in the blank: The fact that the dataset includes people who all live in the same zip code might get in the way of ______.
  • fairness
  • accuracy
  • spreadsheet formulas or functions
  • data visualization

Scenario 2 continued

As you’re reviewing the dataset, you notice that there are a disproportionate number of senior citizens. So, you investigate further and find out that this zip code represents a rural community in Colorado with about 800 residents. In addition, there’s a large assisted-living facility in the area. Nearly 300 of the residents in the 81137 zip code live in the facility.

You recognize that’s a sizable number, so you want to find out if age has an effect on a patient’s likelihood to attend a follow-up dental appointment. You analyze the data, and your analysis reveals that older people tend to miss follow-ups more than younger people.

So, you do some research online and discover that people over the age of 60 are 50% more likely to miss dentist appointments. Sometimes this is because they’re on a fixed income. Also, many senior citizens lack transportation to get to and from appointments.

With this new knowledge, you write an email to your supervisor expressing your concerns about the dataset. He agrees with your concerns, but he’s also impressed with what you’ve learned and thinks your findings could be very important to the project. He asks you to change the business task.
Now, the NDS campaign will be about educating dental offices on the challenges faced by senior citizens and finding ways to help them access quality dental care.The business task has changed. What is the nature of that change?Creating a graphical representation of the dataUsing a database instead of a spreadsheetConducting a gap analysisDefining the new question or problem to be solved Scenario 2 continuedYou continue with your analysis. In the end, your findings support what you discovered during your online research: As people get older, they’re less likely to attend follow-up dental visits.But you’re not done yet. You know that data should be combined with human insights in order to lead to true data-driven decision-making. So, your next step is to share this information with people who are familiar with the problem professionally. They’ll help verify the results of your data analysis.Fill in the blank: The people who are familiar with a problem and help verify the results of data analysis are _____.stakeholderssubject-matter expertscustomersdata scientists Scenario 2 continuedThe subject-matter experts are impressed by your analysis. The team agrees to move to the next step: data visualization. You know it’s important that stakeholders at NDS can quickly and easily understand that older people are less likely to attend important follow-up dental appointments than younger people. This will help them create an effective campaign for members.It’s time to create your presentation to stakeholders. It will include a data visualization that demonstrates the lifetime trend of people being less likely to attend follow-up appointments as they get older.Which type of chart will be most effective?A doughnut chartA tableA pie chartA line chart Scenario 1 continuedYou’ve downloaded the data from your company database and imported it into a spreadsheet. 
IMPORTANT: To answer questions using this dataset for the scenario, click the link below and select the “Use Template” button before answering the questions.

Link to template: Course Challenge - Scenario 1

OR

If you don’t have a Google account, you can download the template directly from the attachment below.

Course Challenge Dataset - Scenario 1 - Scenario 1_ Pharmacy Data - Part 1 (CSV File)

Now, it’s time to process the data. As you know, this step involves finding and eliminating errors and inaccuracies that can get in the way of your results. While cleaning the data, you notice there’s missing data in one of the rows. What might you do to fix this problem? Select all that apply.

  • Ask a colleague on your team how they've handled similar issues in the past
  • Sort the spreadsheet so the row with missing data is at the bottom
  • Delete the row with the missing data point
  • Ask your supervisor for guidance

Scenario 1 continued

Next, you determine the average total daily sales over the past 12 months at all stores. The entire range of cells that contains these sales is E2:E39. To do this, you use a function. You input =AVE(E2:E39), but this returns an error. What is the correct command?

  • =AVERAGE(E2:E39)
  • =AVERAGE(E2+E39)
  • =AVERAGE(E2,E29)
  • =AVERAGE(E2-E39)

Scenario 2, questions 6-10

You’ve been working for the nonprofit National Dental Society (NDS) as a junior data analyst for about two months. The mission of the NDS is to help its members advance the oral health of their patients. NDS members include dentists, hygienists, and dental office support staff.

The NDS is passionate about patient health. Part of this involves automatically scheduling follow-up appointments after crown replacement, emergency dental surgery, and extraction procedures.
NDS believes the follow-up is an important step to ensure patient recovery and minimize infection.

Unfortunately, many patients don’t show up for these appointments, so the NDS wants to create a campaign to help its members learn how to encourage their patients to take follow-up appointments seriously. If successful, this will help the NDS achieve its mission of advancing the oral health of all patients.

Your supervisor has just sent you an email saying that you’re doing very well on the team, and he wants to give you some additional responsibility. He describes the issue of many missed follow-up appointments. You are tasked with analyzing data about this problem and presenting your findings using data visualizations.

An NDS member with three dental offices in Colorado offers to share its data on missed appointments. So, your supervisor uses a database query to access the dataset from the dental group. The query instructs the database to retrieve all patient information from the member’s three dental offices, located in zip code 81137.

The table is dental_data_table, and the column name is zip_code. You write the following query.

SELECT *
FROM dental_data_table
WHERE zip code = 81137

This query is incorrect. How could it be fixed?

  • In line 3, replace zip code with zip_code
  • Decapitalize SELECT, FROM, and WHERE
  • Rewrite line 3 as WHERE_zip code = 81137
  • Rewrite line 3 as zip_code = 81137

Scenario 2 continued

The dataset your supervisor retrieved and imported into a spreadsheet includes a list of patients, their demographic information, dental procedure types, and whether they attended their follow-up appointment.

To use the dataset for this scenario, click the link below and select “Use Template.”

Link to template: Course Challenge - Scenario 2

OR

If you don’t have a Google account, you can download the template directly from the attachment below.

Course Challenge Dataset - Scenario 2 (CSV File)

The patient demographic information includes data such as age, gender, and home address.
You review the demographic data, paying particular attention to geography. What geographic aspect of the data may negatively impact fairness?The patients all live in the same city.The patients all live in houses.The patients all live in the same country.The patients all live in the same zip code.Scenario 2 continuedYou continue with your analysis. In the end, your findings support what you discovered during your online research: As people get older, they’re less likely to attend follow-up dental visits.But you’re not done yet. You know that data should be combined with human insights in order to lead to true data-driven decision-making. So, your next step is to share this information with people who are familiar with the problem professionally. They’ll help verify the results of your data analysis.Fill in the blank: Subject matter experts are people who are familiar with a problem. They can help by identifying inconsistencies in the analysis, _____, and validating the choices being made.redefining the business problemoffering insights into the business problemcreating a presentation with the datacollecting data relevant to the business problem Scenario 2 continuedThe subject-matter experts are impressed by your analysis. The team agrees to move to the next step: data visualization. You know it’s important that stakeholders at NDS can quickly and easily understand that older people are less likely to attend important follow-up dental appointments than younger people. This will help them create an effective campaign for members.It’s time to create your presentation to stakeholders. 
It will include a data visualization that demonstrates the lifetime trend of people being less likely to attend follow-up appointments as they get older.Fill in the blank: The type of chart that would be most effective in visualizing this is a _____.bar chartpie chartdoughnut chartline chart Course 2 – Ask Questions to Make Data-Driven Decisions Week 1 – Effective questions In structured thinking, why would a data analyst organize the available information?To ask SMART questionsTo recognize the current problem or situationTo summarize results using data visualizationsTo consult with subject matter experts A local internet service provider is expecting an increase in the number of people streaming online entertainment. Their data analyst uses data to estimate the required bandwidth necessary to service its customers. This is an example of which problem type?Discovering connectionsIdentifying themesMaking predictionsSpotting something unusual Fill in the blank: The question, “How could we improve our website to simplify the returns process for our online customers?” is _____-oriented.actionbiaspassivedata Structured thinking involves which of the following processes? Select all that apply.Revealing gaps and opportunitiesRecognizing the current problem or situationOrganizing available informationAsking SMART questions A data analyst creates data visualizations and a slideshow. Which phase of the data analysis process does this describe?PrepareActShareProcess A recycling center that sponsors a podcast about saving the environment is an example of what strategy?Defining the problem to be solvedMaking recommendationsStaying on budgetTrying to reach a target audience A data analyst is working for a local power company. Recently, many new apartments have been built in the community, so the company wants to determine how much electricity it needs to produce for the new residents in the future. A data analyst uses data to help the company make a more informed forecast. 
This is an example of which problem type?Spotting something unusualDiscovering connectionsMaking predictionsIdentifying themes Describe the key difference between the problem types of categorizing things and identifying themes.Categorizing things involves determining how items are different from each other. Identifying themes brings different items back together in a single group.Categorizing things involves assigning grades to items. Identifying themes involves creating new classifications for items.Categorizing things involves taking inventory of items. Identifying themes deals with creating labels for items.Categorizing things involves assigning items to categories. Identifying themes takes those categories a step further, grouping them into broader themes. Which of the following examples are leading questions? Select all that apply.What do you enjoy most about our service?How did you learn about our company?In what ways did our product meet your needs?How satisfied were you with our customer representative? The question, “Why don’t our employees complete their timesheets each Friday by noon?” is not action-oriented. Which of the following questions are action-oriented and more likely to lead to change? Select all that apply.What functionalities would make our timesheet web page more user-friendly?What features could we add to our calendar app as a weekly timesheet reminder to employees?How could we simplify the time-keeping process for our employees?Why don’t employees prioritize filling out their timesheets by noon on Fridays? 
On a customer service questionnaire, a data analyst asks, “If you could contact our customer service department via chat, how much valuable time would that save you?” Why is this question unfair?It is closed-endedIt uses slang words that not everyone can understandIt is vagueIt makes assumptions Shuffle Q/AOrganizing available information and revealing gaps and opportunities are part of what process?Identifying connections between two or more thingsCategorizing thingsUsing structured thinkingApplying the SMART methodology The share phase of the data analysis process typically involves which of the following activities? Select all that apply.Summarizing results using data visualizationsCommunicating findingsCreating a slideshow to present to stakeholdersPutting analysis into action to solve a problem A company wants to make more informed decisions regarding next year’s business strategy. An analyst uses data to help identify how things will likely work out in the future. This is an example of which problem type?Making predictionsSpotting something unusualIdentifying themesDiscovering connections Fill in the blank: Categorizing things involves assigning items to categories, whereas _____ takes those categories a step further, grouping them into broader classifications.Making predictionsFinding patternsDiscovering connectionsIdentifying themes Questions that make assumptions often involve concepts that are formed without evidence. An example of this is an idea that is accepted as true without proof.TrueFalse A garden center wants to attract more customers. A data analyst in the marketing department suggests advertising in popular landscaping magazines. This is an example of what practice?Reaching your target audienceCollecting customer informationMonitoring social media feedbackDeveloping a data analytics case study Categorizing things involves assigning items to categories. 
Identifying themes takes those categories a step further, grouping them into broader themes or classifications.TrueFalse Which of the following examples are closed-ended questions? Select all that apply.Is math your favorite subject?What grade did you get on the math test?How old are you?What are your thoughts about math? The question, “How could we improve our website to simplify the returns process for our online customers?” is action-oriented.TrueFalse Which of the following questions make assumptions? Select all that apply.Keeping employees engaged is important, isn’t it?Wouldn’t you agree that product A is better than product B?Did you get through to customer service?It must be frustrating waiting on hold for so long, right? Structured thinking involves recognizing the current problem or situation you’re facing and identifying your options.TrueFalse Which of the following examples are leading questions? Select all that apply.How satisfied were you with our customer representative?What do you enjoy most about our service?In what ways did our product meet your needs?How did you learn about our company? On a customer service questionnaire, a data analyst asks, “If you could contact our customer service department via chat, how much valuable time would that save you?” Why is this question unfair?It is closed-endedIt uses slang words that not everyone can understandIt is vagueIt makes assumptions Fill in the blank: To apply structured thinking, a data analyst should ______ the available information in order to reveal gaps and opportunities and recognize the current problem or situation.organizecommunicatesharerecord A national chain of sporting goods stores advertises during popular sporting television broadcasts. 
This is an example of the company doing what?Reaching its target audienceDemonstrating its support for a sports teamDefining the problem to be solvedMonitoring social feedback In data analysis, categorizing things involves which of the following?Creating new classifications for items and assigning grades to itemsAssigning items to categoriesTaking an inventory of itemsDetermining how items are different from each other The question, “Why was the Monday afternoon yoga class successful?” is not measurable. Which of the following questions presents a measurable way to learn about the yoga class?Why do people like taking yoga classes on Mondays?How many customers responded to our recent half-price yoga promotion?Is yoga a great way to stretch and strengthen your body?Do yoga instructors seem more energetic at the beginning of the week? Why should a data analyst only ask fair questions?Unfair questions do not have answers.Unfair questions can provide data that is misleading.Fair questions are biased.Fair questions do not offend people. In the share step of the data analysis process, a data analyst summarizes their results using data visualizations and creates a slideshow to present to stakeholders. What else might they do in this step?Collect data.Communicate findings.Organize the available informationShred paper files. If a cooking supply store wants to attract more customers, where can they advertise to better reach their target audience? Select all that apply.On TV during the season finale of The Best Chef in the UniverseAt a bus stop near a local culinary schoolOn a podcast for foodiesIn a magazine all about advertising Making predictions is one of the six data analytics problem types. How does data factor into such problem types?The data informs the predictions.The data confirms the decisions.The data are the predictions.The predictions validate the data. Which of the following examples are closed-ended questions? 
Select all that apply.

  • How tall are you?
  • What did you think about the article that I sent you?
  • What is your opinion of the new movie?
  • Have you taken this class before?

What is the defining characteristic of measurable questions?

  • They are questions that have numbers in them.
  • Their answers are numbers that can be interpreted qualitatively.
  • They are questions that use numbers as categories.
  • Their answers are numbers that can be interpreted mathematically.

Fill in the blank: “How many people filled out the survey?” is an example of a question that is _____ in the context of data analysis.

  • categorical
  • symbolic
  • measurable
  • qualitative

Week 2 – Data-driven decisions

An analyst is working with data from two school programs. They discover that the data is measured differently across programs and this may impact how they can work with the data. What does this example describe?

  • Data-inspired decision-making
  • Data-driven decision-making
  • The limitations of working with data
  • Data that cannot be analyzed

A retail store runs a special sale with the goal of increasing sales over the holiday season. They use the increase in sales over the same month last year as a starting point. What type of goal is this an example of?

  • Metric goal
  • Theoretical goal
  • Finite goal
  • Conceptual goal

A data analyst assesses how well their company’s marketing campaign is performing. They apply a formula that compares the cost of the campaign and its net profit.
What does this formula measure?The return on investmentTotal revenueThe average costTotal cost Which of the following statements describes an algorithm?A process or set of rules to be followed for a specific taskA method for recognizing the current problem or situation and identifying the optionsA tool that enables data analysts to spot something unusualA technique for focusing on a single topic or a few closely related ideas Fill in the blank: If a data analyst is measuring qualities and characteristics, they are considering _____ data.quantitativeunbiasedcleanedqualitative In data analytics, reports use live, incoming data from multiple datasets; dashboards use static collections of data.TrueFalse A pivot table is a data-summarization tool used in data processing. Which of the following tasks can pivot tables perform? Select all that apply.Group dataClean dataCalculate totals from dataReorganize data A metric is a single, quantifiable type of data that can be used for what task?Setting and evaluating goalsDefining a problem typeCleaning dataSorting and filtering data Which of the following options describes a metric goal? Select all that apply.Evaluated using metricsIndefiniteMeasurableBased on theory Fill in the blank: Return on investment compares the _____ of an investment to the net profit gained from that investment.successpurposecosttiming Fill in the blank: A data analyst is using data to address a large-scale problem. This type of analysis would most likely require _____. Select all that apply.small datadata that reflects change over timedata represented by a limited number of metricsbig data Shuffle Q/AFill in the blank: In data analytics, qualitative data _____. 
Select all that apply.is always time boundmeasures qualities and characteristicsis subjectivemeasures numerical facts Fill in the blank: A _____ is a data-summarization tool used to sort, reorganize, group, count, total, or average data.reportdashboardfunctionpivot table Fill in the blank: A _____ goal is measurable and evaluated using single, quantifiable data.metricfiniteconceptualbenchmark Describe the main differences between big and small data.Small data is typically stored and organized in databases. Big data is typically stored and organized in spreadsheets.Small data is less useful to data analysts. Big data is more useful to data analysts.Small data is specific and concerns a short time period. Big data is less specific and concerns a longer time period.Small data has been cleaned and sorted. Big data has not yet been cleaned or sorted. In data analytics, a pattern is defined as a process or set of rules to be followed for a specific task.TrueFalse In data analytics, quantitative data measures qualities and characteristics.TrueFalse In data analytics, reports use data that doesn’t change once it’s been recorded. Which of the following terms describes this type of data?ComprehensiveReal-timeMonitoredStatic Which data-summarization tool do data analysts use to sort, reorganize, group, count, total, or average data?A functionA pivot tableA dashboardA report A metric is a specific type of data that companies use to identify a problem domain.TrueFalse Fill in the blank: A metric goal is a _____ goal set by a company that is evaluated using metrics.finitetheoreticalconceptualmeasurable A data analyst is using data from a short time period to solve a problem related to someone’s day-to-day decisions. They are most likely working with small data.TrueFalse If a data analyst compares the cost of an investment to the net profit of that investment over a period of time, they’re analyzing the investment scope.TrueFalse What is an example of using a metric? 
Select all that apply.Using column headers to sort and filter dataUsing annual profit targets to set and evaluate goalsUsing key performance indicators, such as click-through rates, to measure revenueUsing a pie chart to visualize data Fill in the blank: In data analytics, a process or set of rules to be followed for a specific task is _____.a patterna domainan algorithma value Fill in the blank: Return on investment compares the cost of an investment to the _____ of that investment.purposetimingnet profitfuture success Week 3 – More spreadsheet basics What calculations can you carry out within a spreadsheet? Select all that apply.MinimumMaximumCopyingAverage What are some of the ways that data analysts can gather data? Select all that apply.Use data received from a colleagueUse data they collect themselvesUse data from open source locationsUse restricted data from the government You sum the entries in cells F3 through F200 in your spreadsheet. What is the correct function for this?=SUM(F3+F200)=SUM(F3;F200)=SUM(F3,F200)=SUM(F3:F200) What are some of the causes of bias in data analytics? 
Select all that apply.Cultural differencesSocial normsMultiple perspectivesServing an agenda Fill in the blank: In spreadsheets, data analysts begin _____ with an equal sign (=).cellsnumbersformulascharts Fill in the blank: The labels that describe the type of data contained in each column of a spreadsheet are called _____.assignmentsattributesallowancesaspects Which of the following tasks might be performed using spreadsheets?Maintain information about accountsWrite a sales pitchDevelop communication skillsLand a new client Formulas are created by the user, whereas functions are preset commands in spreadsheets.TrueFalse In the function =MAX(B5:B15), what does B5:B15 represent?ObservationColumnAttributeRange What is the correct spreadsheet formula for multiplying cell H2 times cell H5?=H2/H5=H2^H5=H2*H5=H2xH5 To avoid bias when collecting data, a data analyst should keep what in mind?ContextOpinionStakeholdersGraphs A data analyst might use descriptive column headers in order to achieve what goal?Add context to their dataProtect the spreadsheetAlphabetize the spreadsheet dataFilter the data Shuffle Q/ATo determine an organization’s annual budget, a data analyst might use a slideshow.TrueFalse Which of the following are ways that data analysts can add context to their data? 
Select all that apply.

  • Use descriptive column headers
  • Consider where the data came from
  • Create reports for stakeholders
  • Ask questions about the data

In spreadsheets, formulas and functions end with an equal sign (=).

  • True
  • False

A data analyst could use spreadsheets to achieve which of the following tasks?

  • Motivate employees
  • Write reports
  • Build code for a new app
  • Predict next quarter’s sales

In the function =MAX(G3:G13), what does G3:G13 represent?

  • An attribute
  • An observation
  • The range
  • A table

What is the correct spreadsheet formula for multiplying cell D5 times cell D7?

  • =D5xD7
  • =D5^D7
  • =D5*D7
  • =D5/D7

Fill in the blank: A data analyst considers which organization created, collected, or funded a dataset in order to understand its _____.

  • structure
  • detail
  • length
  • context

Which of the following statements accurately describe formulas and functions? Select all that apply.

  • Formulas are instructions that perform specific calculations.
  • Formulas may only be used once per spreadsheet column.
  • Functions are preset commands that perform calculations.
  • Formulas and functions assist data analysts in calculations, both simple and complex.
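The questions above contrast a preset function applied to a range, such as =MAX(G3:G13), with a user-written formula, such as =D5*D7. The following is only a rough Python analogy of that distinction, not part of the course material: the cell values and the `cell_range` helper are made up for illustration.

```python
# Hypothetical values for cells G3 through G13 (11 cells); not taken from any course dataset.
values = [12, 7, 31, 4, 18, 25, 9, 16, 22, 3, 14]
column_g = {f"G{row}": value for row, value in zip(range(3, 14), values)}


def cell_range(cells, col, start_row, end_row):
    """Return the values in col{start_row}:col{end_row}, like a spreadsheet range."""
    return [cells[f"{col}{row}"] for row in range(start_row, end_row + 1)]


# Function: a preset command applied to a range, analogous to =MAX(G3:G13).
max_value = max(cell_range(column_g, "G", 3, 13))

# Formula: an expression the user writes with operators, analogous to =D5*D7.
d5, d7 = 6, 7  # hypothetical cell contents
product = d5 * d7

print(max_value)  # 31
print(product)    # 42
```

Here `G3:G13` corresponds to the list returned by `cell_range`: the range is the set of cells the function operates on, while the function name (`max`, or MAX in a spreadsheet) is the preset command.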
In the function =MAX(B5:B15), what does B5:B15 represent?AttributeColumnObservationRange What is the correct spreadsheet formula for multiplying cell H2 times cell H5?=H2*H5=H2/H5=H2xH5=H2^H5 Both formulas and functions in spreadsheets begin with what symbol?Equal sign (=)Colon (:)Hyphen (-)Bracket ([) Fill in the blank: By negatively influencing data collection, ____ can have a detrimental effect on analysis.objectivitybiaspartialityfiltering Attributes are used in spreadsheets for what purpose?Analyze the data in a rowInsert data into each columnAdd a new columnLabel the data in each column To determine an organization’s annual budget, a data analyst might use a slideshow.TrueFalse Which of the following statements describes a key difference between formulas and functions?Formulas contain words and numbers, and functions contain numbers only.Formulas span two or more cells, and functions exist in only one cell.Formulas are used in graphs, and functions are not.Formulas are written by the user, and functions are already defined. What do data analysts use to label the type of data contained in each column in a spreadsheet?TablesMenusAttributesHeadings In the function =MAX(A1:A12), what does A1:A12 represent?The rangeThe operatorThe maximumThe formula Fill in the blank: Putting data into context helps data analysts eliminate _____.labelsintolerancebiasfairness Week 4 – Always remember the stakeholder Fill in the blank: Your data analytics team is working on a project for the marketing department. The person most likely to be the _____ stakeholder is the vice president of marketing.primarynecessarysecondaryproject To communicate clearly with stakeholders and team members, there are four key questions data analysts ask themselves. One of the questions is: What does my audience already know? Identify the remaining three questions. 
Select all that apply.What does my audience need to know?How can I communicate effectively to my audience?Why are stakeholders and team members important?Who is my audience? You accept a new project from a high level stakeholder. After beginning the project, you find that you aren’t sure what you are supposed to do. How do you handle this?Determine the objectives that make the most sense and work towards those.Set up a meeting with the stakeholder to discuss the specific objectives they wanted.Ask a member of your team what was done on the last project and do the same.Perform the standard analysis and present its insights. A data analyst collects a large amount of data for their project to ensure that the data represents a diverse set of perspectives. What element of data collection does this describe?Sample sizeStatistical significanceVisualizationData cleaning When leading a meeting, it is important to respect your team members’ time. What are some ways of doing this? Select all that apply.Pay attention to what others are sayingArrive to the meeting on timeDiscuss work that does not impact the attendees.Be prepared to talk about your work What are some of the “don’ts” when attending a meeting?Don't dominate the conversation.Don't show up unprepared.Don’t arrive late.Don’t arrive early. Your manager assigns you a project task, and you don’t understand the point of the project. What questions can you ask them to determine the objective? Select all that apply.What is their end goal?What do you have to do for this task?What is the story they want to tell?What is the big picture? A data analyst starts a new project for the operations team at their company. They take a few hours at the beginning of the project to identify their stakeholders. The secondary stakeholders are most likely which of the following people? 
Select all that apply.The data analystThe project managerThe president of the companyThe vice president of operations A data analyst is researching the buying behavior of people who shop at a company’s retail store and those who might shop there in the future. During the analysis, it will be important to stay in communication with the people who most often interact with these shoppers. They are members of the executive team.TrueFalse There are four key questions data analysts ask themselves: Who is my audience? What do they already know? What do they need to know? And how can I communicate effectively with them? These questions enable data analysts to achieve what goal?Understand who is managing the dataCommunicate clearly with stakeholders and team membersIdentify primary and secondary stakeholdersComplete data analysis projects on time Data analysts pay attention to sample size in order to achieve what goals? Select all that apply.To fully understand the scope of the analytics projectTo avoid a small sample size leading to inaccurate judgementsTo make sure the data represents a diverse set of perspectivesTo make sure a few unusual responses don’t skew results A data analyst receives an email from the vice president of marketing. The vice president is upset because the report they want from the analyst is late. Select the best course of action.The analyst should call the vice president and ask them how important it really is to their marketing efforts.The analyst should send the report immediately, even if it’s not completely finished. This will make the vice president happy.The analyst should respond saying they understand the vice president’s concerns, provide a status update, and let the vice president know when to expect the completed report.The analyst should apologize for the delay and inform the vice president that the marketing managers caused the delay. Arriving at meetings prepared is an important part of creating a professional work environment. 
This involves which of the following actions? Select all that apply.Bringing materials to take notes withConsidering what questions you may be asked so you’re prepared to answerReading the meeting agenda ahead of timeBringing a laptop to keep an eye on emails A data analyst joins an online meeting on time. After reviewing the agenda, they see that their project comes at the very end. They’re extremely busy and can use this time to stay on top of their current projects. How should they proceed?Mute themselves and turn off the camera, then continue working on other tasks until their project is mentioned.Tell the participants that they’re having technical trouble, then leave the meeting to continue working on other tasks.Politely let the presenter know they’re going to leave the meeting and rejoin toward the end.Stay focused and attentive during the entire meeting. Even though some items on the agenda don’t affect their projects, they could still learn something or have something to contribute. Your data analytics team has been working on a project for a few weeks. You’re almost done, when your supervisor suddenly changes the business task. Everyone has to start all over again. You announce to the team that you’re going to say something to the supervisor about how unreasonable this is. What’s the best next step?Insist that the entire data analytics team complain to your supervisor.Go see your supervisor face-to-face and tell them why you’re so upset.Write a polite, but strongly worded email to your supervisor.Take a few minutes to calm down, then ask your colleagues to share their perspectives so you can work together to determine the best next step. Shuffle Q/AA data analyst is researching the buying behavior of people who shop at a company’s retail store and those who might shop there in the future. During the analysis, it will be important to stay in communication with the team that most often interacts with these shoppers. 
What is the name of this team?
  • Data science team
  • Project management team
  • Executive team
  • Customer-facing team

You receive an angry email from a colleague on the marketing team. The marketing colleague believes you have taken credit for their work. You do not believe this is true. Select the best course of action.
  • Delete the email. It’s best not to create any additional conflict.
  • Reply to the email, asking if they can schedule a time to talk about this in person in order to allow both of you to share your perspectives.
  • Walk over to the marketing colleague’s cubicle, and tell them you strongly disagree.
  • Forward the email to the marketing director with an equally angry note.

A data analyst has been invited to a meeting. They review the agenda and notice that their data analysis project is one of the topics that will be discussed. How can they prepare for an effective meeting? Select all that apply.
  • Bring materials for taking notes.
  • Plan to arrive on time.
  • Think about what project updates they should share.
  • Create and share a revised agenda that includes many more details about their project.

Which of the following steps are key to leading a professional online meeting? Select all that apply.
  • Maintaining control of the meeting by keeping everyone else on mute
  • Sitting in a quiet area that’s free of distractions
  • Making sure your technology is working properly before starting the meeting
  • Keeping an eye on your inbox during the meeting in case of an important email

A team member has asked you to take on a task, and you don’t understand the point of the project. It seems like it will be a waste of your time. The best course of action would be to politely explain your concerns and decline the project.
  • True
  • False

Fill in the blank: A data analytics team is working on a project to measure the success of a company’s new financial strategy.
The vice president of finance is most likely to be the _____.
  • project manager
  • analyst
  • primary stakeholder
  • secondary stakeholder

At an online marketplace, the _____ includes anyone in an organization who interacts with current or potential shoppers.
  • executive team
  • data science team
  • project management team
  • customer-facing team

There are four key questions data analysts ask themselves: Who is my audience? What do they already know? What do they need to know? And how can I communicate effectively with them? These questions enable data analysts to identify the person in charge of managing the data.
  • True
  • False

A data analyst has been invited to a meeting. They review the agenda and notice that their data analysis project is one of the topics that will be discussed. They plan to arrive on time and have a pen and paper to take notes. But they do not spend time considering project updates they could share or questions they may be asked. This is appropriate because they’re not the one running the meeting.
  • True
  • False

A data analytics team is working on a project to measure the success of a company’s new financial strategy. Select the person most likely to be the primary stakeholder for this project.
  • The project manager
  • The data analyst
  • The vice president of finance
  • The director of analytics

To communicate clearly with stakeholders and team members, there are four key questions data analysts ask themselves. One of them is: What does my audience need to know? Identify the remaining three questions. Select all that apply.
  • Why are stakeholders and team members important?
  • Who is my audience?
  • How can I communicate effectively to my audience?
  • What does my audience already know?

Conflict is a natural part of working on a team. What are some ways to help shift a situation from problematic to productive?
Select all that apply.
  • Take a moment to check your emotions before engaging in an argument.
  • Ask for a conversation to help you better understand the big picture.
  • Reframe the question by asking, “How can I help?”
  • Identify the person who caused the issue so they can take responsibility.

Data analysts focus on statistical significance to make sure they have enough data so that a few unusual responses don’t skew results.
  • True
  • False

A data analyst feels overworked. They often stay late to finish work, and have started missing deadlines. Their supervisor emails them another project to complete, and this causes the analyst even more stress. How should they handle this situation?
  • Accept the new project right away and hope to not miss another deadline.
  • Wait a few minutes to think it over, then respond with a meeting request to discuss this project and the general workload.
  • Walk into the supervisor’s office and tell them to give the project to someone else.
  • Respond immediately, letting the supervisor know the expectations at this company are unreasonable.

When participating in an online meeting, it’s okay to keep your inbox open in another browser window. Participants won’t be distracted because they can’t see it, and you might receive a very important message.
  • True
  • False

Course challenge

Scenario 1, questions 1-5

You’ve just started a job as a data analyst at a small software company that provides data analytics and business intelligence solutions. Your supervisor asks you to kick off a project with a new client, Athena’s Story, a feminist bookstore. They have four existing locations, and the fifth shop has just opened in your community.

Athena’s Story wants to produce a campaign to generate excitement for an upcoming celebration and introduce the bookstore to the community. They share some data with your team to help make the event as successful as possible.

Your task is to review the assignment and the available data, then present your approach to your supervisor.
Click the link below to access the email from your supervisor: Course 2 Scenario 1 Email from Supervisor.pdf (PDF file).

Then, review the email, and the Customer Survey and Historical Sales datasets. To use the templates for the datasets, click the links below and select “Use Template.” Links to templates: Customer Survey and Historical Sales. Or, if you don't have a Google account, you can download the CSV files directly from the attachments below: CustomerSurvey - CustomerSurvey (CSV file) and HistoricalSales - HistoricalSales (CSV file).

After reading the email, you notice that the acronym WHM appears in multiple places. You look it up online, and the most common result is web host manager. That doesn’t seem right to you, as it doesn’t fit the context of a feminist bookstore. Still, you should assume it’s correct and continue with the project.
  • True
  • False

Scenario 1 continued

Now that you know WHM stands for Women’s History Month, you continue reviewing the datasets. You notice that the Customer Survey dataset contains both qualitative and quantitative data.

Dataset: Customer Survey (template or CustomerSurvey CSV file).

The qualitative data includes information from which columns? Select all that apply.
  • Column E (Survey Q5: What do you like most about Athena's Story?)
  • Column B (Survey Q2: If answered "Yes" to Q1, how do you plan to celebrate?)
  • Column D (Survey Q4: If answered "Yes" to Q3, how many books do you typically purchase during March?)
  • Column F (Survey Q6: What types of books would you like to see more of at Athena's Story?)
Scenario 1 continued

Next, you review the customer feedback in column F of the Customer Survey dataset.

Dataset: Customer Survey (template or CustomerSurvey CSV file).

The attribute of column F is, “Survey Q6: What types of books would you like to see more of at Athena's Story?” In order to verify that children’s literature and feminist zines are among the most popular genres, you create a visualization. This will help you clearly identify which genres are most likely to sell well during the Women’s History Month campaign.

Your visualization looks like this:

Pie chart categories:
  • Feminist science fiction 4.8%
  • Books about women 2.4%
  • Women's journals 2.4%
  • Feminist literary criticism 2.4%
  • Children's literature 15.5%
  • Women's history books 2.4%
  • Biographies of inspiration 20.2%
  • Feminist fiction 26.2%
  • Feminist zines 14.3%
  • Feminist poetry 4.6%
  • Feminist novels 3.6%

Fill in the blank: The visualization you create demonstrates the percentages of each book genre that make up the total number of survey responses.
It’s called a _____ chart.
  • bubble
  • pie
  • doughnut
  • area

Now that you’ve confirmed that children’s literature and feminist zines are among the most requested book genres, you review the Historical Sales dataset.

Dataset: Historical Sales (template or HistoricalSales CSV file).

You’re pleased to see that the dataset contains data that’s specific to children’s literature and feminist zines. This will provide you with the information you need to make data-inspired decisions. In addition, the children’s literature and feminist zines metrics will help you organize and analyze the data about each genre in order to determine if they’re likely to be profitable.

Next, you calculate the total sales over 52 weeks for feminist zines. You type =CALCULATE(E2-E53) but get an error. What is the correct syntax?
  • =MAX(E2:E53)
  • =COUNT(E2:E53)
  • =SUM(E2:E53)
  • =CALC(E2:E53)

Scenario 1 continued

After familiarizing yourself with the project and available data, you present your approach to your supervisor. You provide a scope of work, which includes important details, a schedule, and information on how you plan to prepare and validate the data. You also share some of your initial results and the pie chart you created.

In addition, you identify the problem type, or domain, for the data analysis project. You decide that the historical sales data can be used to provide insights into the types of books that will sell best during Women’s History Month this coming year.
This will also enable you to determine if Athena’s Story should begin selling more children’s literature and feminist zines.

Using historical data to make informed decisions about how things may be in the future is an example of spotting something unusual.
  • True
  • False

Scenario 2, questions 6-10

You’ve completed this program and are now interviewing for your first junior data analyst position. You’re hoping to be hired by an event planning company, Patel Events Plus. Access the job description below: Junior Data Analyst Job Description.pdf (PDF file).

So far, you’ve successfully completed the first round of interviews with the human resources manager and director of data and strategy. Now, the vice president of data and strategy wants to learn more about your approach to managing projects and clients. Access the email you receive from the human resources director below: Human Resources Director Email.pdf (PDF file).

You arrive Thursday at 1:45 PM for your 2 PM interview. Soon, you’re taken into the office of Mila Aronowicz, vice president of data and strategy. After welcoming you, she begins the behavioral interview.

First, she hands you a copy of Patel Events Plus’s organizational chart. Access the chart below: Patel Event Plus Org Chart.pdf (PDF file).

As you’ve learned in this course, stakeholders are people who invest time, interest, and resources into the projects you’ll be working on as a data analyst. Let’s say you’re working on a project involving data and strategy.

Based on what you find in the organizational chart, who should be considered the primary stakeholder for projects involving data and strategy?
  • Director
  • Project manager
  • Chief executive officer
  • Vice president

Scenario 2 continued

Next, the vice president wants to understand your knowledge about asking effective questions. Consider and respond to the following question. Select all that apply.

Let’s say we just completed a big event for a client and wanted to find out if they were satisfied with their experience.
Provide some examples of measurable questions that you could include in the customer feedback survey. Select all that apply.
  • Would you recommend Patel Events Plus to a colleague or friend? Yes or no?
  • Why did you enjoy the event planned by Patel Events Plus?
  • On a scale from 1 to 5, with 1 being not at all likely and 5 being very likely, how likely are you to recommend Patel Events Plus?
  • How would you describe your event experience?

Scenario 2 continued

Now, the vice president presents a situation having to do with resolving challenges and meeting stakeholder expectations. Consider and respond to the following question.

You’re working on a rush project, and you discover your dataset is not clean. Even though it has numerous nulls, redundant data, and other issues, the primary stakeholder insists that you move ahead and use it anyway. The project timeline is so tight that there simply isn’t enough time for cleaning. How would you handle that situation?
  • Contact the stakeholder’s boss to let them know about the issue and ask for help managing the stakeholder’s expectations.
  • The stakeholder is in charge. It's best to do as they say and use the unclean dataset.
  • Clean the data as quickly as you can. It’s not perfect, but it’s better than it was before, and this way you can meet the deadline.
  • Communicate the situation to your supervisor and ask for advice on how to handle the situation with the stakeholder.

Scenario 2 continued

Your next interview question deals with sharing information with stakeholders. Consider and respond to the following question. Select all that apply.

Let’s say you’ve designed a dashboard to give stakeholders easy, automatic access to data about an upcoming event. Describe the benefits of using a dashboard. Select all that apply.
  • Dashboards offer live monitoring of incoming data.
  • Dashboards enable stakeholders to interact with the data.
  • Dashboards are easy to design and understand.
  • Dashboards present pre-cleaned, historical data.
Scenario 2 continued

Your final behavioral interview question involves using metrics to answer business questions. Your interviewer hands you a copy of a Patel Events dataset.

To use the template for this dataset, click the link below and select “Use Template.” Link to template: Patel Events Data. Or, if you don't have a Google account, you can download the CSV file directly from the attachment below: Patel Events Plus dataset (CSV file).

Then, she asks: Recently, Patel Events Plus purchased a new venue for our events. If we asked you to calculate the return on investment of this purchase, the metrics to consider would be the cost of the investment and what else?
  • Net profit in 2019
  • Average event revenues
  • 2019 events held at new venue
  • Purchase date

Scenario 1, questions 1-5

You’ve just started a job as a data analyst at a small software company that provides data analytics and business intelligence solutions. Your supervisor asks you to kick off a project with a new client, Athena’s Story, a feminist bookstore. They have four existing locations, and the fifth shop has just opened in your community.

Athena’s Story wants to produce a campaign to generate excitement for an upcoming celebration and introduce the bookstore to the community. They share some data with your team to help make the event as successful as possible.

Your task is to review the assignment and the available data, then present your approach to your supervisor.
Click the link below to access the email from your supervisor: Course 2 Scenario 1 Email from Supervisor.pdf (PDF file).

Then, review the email, and the Customer Survey and Historical Sales datasets.

Datasets: Customer Survey and Historical Sales (templates, or CustomerSurvey and HistoricalSales CSV files).

After reading the email, you notice that the acronym WHM appears in multiple places. You look it up online, and the most common result is web host manager. That doesn’t seem right to you, as it doesn’t fit the context of a feminist bookstore. You email your supervisor to ask. When writing your email, what do you do to ensure it sounds professional? Select all that apply.
  • Respect your supervisor’s time by writing an email that’s short and to the point.
  • Use a polite greeting and closing.
  • Read your email aloud before sending to catch any typos or grammatical errors and to ensure the communication is clear.
  • Write a clear subject line that gets a fast response so you can keep working: “WHM? NEED TO KNOW WHAT THAT IS RIGHT AWAY.”

Scenario 1 continued

Now that you know WHM stands for Women’s History Month, you continue reviewing the datasets. You notice that the Customer Survey dataset contains both qualitative and quantitative data.

Dataset: Customer Survey (template or CustomerSurvey CSV file).

The quantitative data includes information from which columns?
Select all that apply.
  • Column D (Survey Q4: If answered "Yes" to Q3, how many books do you typically purchase during March?)
  • Column C (Survey Q3: Do you purchase feminist books in honor of WHM, either for yourself or as a gift for someone else?)
  • Column E (Survey Q5: What do you like most about Athena's Story?)
  • Column A (Survey Q1: Do you plan to celebrate WHM?)

Scenario 2, questions 6-10

You’ve completed this program and are now interviewing for your first junior data analyst position. You’re hoping to be hired by an event planning company, Patel Events Plus. Access the job description below: Junior Data Analyst Job Description.pdf (PDF file).

So far, you’ve successfully completed the first round of interviews with the human resources manager and director of data and strategy. Now, the vice president of data and strategy wants to learn more about your approach to managing projects and clients. Access the email you receive from the human resources director below: Human Resources Director Email.pdf (PDF file).

You arrive Thursday at 1:45 PM for your 2 PM interview. Soon, you’re taken into the office of Mila Aronowicz, vice president of data and strategy. After welcoming you, she begins the behavioral interview.

First, she hands you a copy of Patel Events Plus’s organizational chart. Access the chart below: Patel Event Plus Org Chart.pdf (PDF file).

As you’ve learned in this course, stakeholders are people who invest time, interest, and resources into the projects you’ll be working on as a data analyst. Let’s say you’re working on a project involving data and strategy.

Based on what you find in the organizational chart, which individuals are considered the secondary stakeholders? Select all that apply.
  • Project manager, analytics
  • Data analytics coordinator
  • Director, data analytics
  • Chief executive officer

Scenario 2 continued

Next, the vice president wants to understand your knowledge about asking effective questions. Consider and respond to the following question.
Select all that apply.

Let’s say we just completed a big event for a client and wanted to find out if they were satisfied with their experience. Provide some examples of measurable questions that you could include in the customer feedback survey. Select all that apply.
  • How would you rate your overall experience: poor, average, above average, or excellent?
  • Why did our event options and features create a successful event?
  • Was this your first time using Patel Events Plus to plan your event? Yes or no?
  • Did you experience any problems with your event? Yes or no?

Now, the vice president presents a situation having to do with resolving challenges and meeting stakeholder expectations. Consider and respond to the following question. Select all that apply.

You’re working with a dataset that the data analytics coordinator should have cleaned, but it turns out that it wasn’t. Your supervisor thought the dataset was ready for use, but you discover nulls, redundant data, and other issues. The project is due in less than two weeks. Which of the following options would be an appropriate approach? Select all that apply.
  • Proceed with the project using the available data. You don’t want to get the associate data analyst in trouble, and you don’t want to miss your deadline.
  • Email the data analytics coordinator to ask if the two of you can work together to clean the data, as the project is on a tight timeline.
  • Provide your supervisor with a proposed revised timeline. Politely explain that you need some additional time to clean the data.
  • Email your supervisor and the data analytics coordinator to communicate about the issue. Ask if you can meet to come up with a solution.

Scenario 2 continued

Your next interview question deals with sharing information with stakeholders. Consider and respond to the following question. Select all that apply.

Let’s say you’ve created a report to present stakeholders with information about an upcoming event. Describe the benefits of using a report.
Select all that apply.
  • Reports enable stakeholders to interact with the data.
  • Reports offer live monitoring of incoming data.
  • Reports reflect data that’s already been cleaned and sorted.
  • Reports provide a snapshot of high-level, historical data.

Course 3 – Prepare Data for Exploration

Week 1 – Data types and structures

A data analyst is preparing an annual report for company executives and decides to use internal data. Why do they choose to use internal data? Select all that apply.
  • Internal data is easier to collect.
  • Internal data is less likely to need cleaning.
  • Internal data is more reliable.
  • Internal data is less vulnerable to biased collection.

A data analyst is reviewing data that has been organized into a table format. What type of data is in the table?
  • Unstructured data
  • Internal data
  • External data
  • Structured data

A data analyst is reviewing a spreadsheet. They find that the columns contain the data variables. What data format does this describe?
  • Tall data
  • Wide data
  • Short data
  • Narrow data

A data analyst at a book publisher is working on an urgent report for executives. They are using only historical data. What is the most likely reason for choosing to analyze only historical data?
  • The project has a very short time frame
  • The data is unknown
  • There is plenty of time to research historical data
  • The data is constantly changing

Which of the following are examples of discrete data? Select all that apply.
  • Box office returns
  • Movie running time
  • Movie budget
  • Number of actors in movie

Which of the following questions collects nominal qualitative data?
  • Is this your first time dining at this restaurant?
  • How many people do you usually dine with?
  • How many times have you dined at this restaurant?
  • On a scale of 1-10, how would you rate your service today?
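
For readers who want to see the wide data format from the spreadsheet question above in concrete terms, here is a minimal sketch in plain Python. The table contents, the column names, and the helper `wide_to_long` are hypothetical and chosen only for illustration; they do not come from the course materials. The sketch reshapes a small wide table (one column per variable) into a long one (one row per value, with a separate column naming what the value refers to).

```python
# Hypothetical wide-format table: one row per store, one column per year.
wide_rows = [
    {"store": "North", "2019": 10, "2020": 12},
    {"store": "South", "2019": 7, "2020": 9},
]


def wide_to_long(rows, id_column):
    """Turn every non-id column into its own (id, variable, value) row."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue  # the id column is repeated on every long row instead
            long_rows.append(
                {id_column: row[id_column], "variable": column, "value": value}
            )
    return long_rows


long_rows = wide_to_long(wide_rows, "store")
for row in long_rows:
    print(row)
```

Each wide row with two year columns becomes two long rows, so the same values are kept and only the structure of the table changes.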
Why is internal data considered more reliable and easier to collect than external data?
  • Internal data circumvents privacy restrictions.
  • Internal data comes from people you know.
  • Internal data has much larger sample sizes.
  • Internal data lives within a company’s own systems.

A social media post is an example of structured data.
  • True
  • False

Fill in the blank: A Boolean data type can have _____ possible values.
  • three
  • 10
  • two
  • infinite

The following is a selection from a spreadsheet: What kind of data format does it contain?
  • Short
  • Wide
  • Narrow
  • Long

A data analyst is working in a spreadsheet application. They use Save As to change the file type from .XLS to .CSV. This is an example of a data transformation.
  • True
  • False

Shuffle Q/A

A data analyst is working on an urgent traffic study. As a result of the short time frame, which type of data are they most likely to use?
  • Theoretical
  • Historical
  • Personal
  • Unclean

Nominal qualitative data has a set order or scale.
  • True
  • False

Internal data is more reliable because it’s clean.
  • True
  • False

Structured data is likely to be found in which of the following formats? Select all that apply.
  • Audio file
  • Digital photo
  • Spreadsheet
  • Table

A Boolean data type must have a numeric value.
  • True
  • False

In long data, separate columns contain the values and the context for the values, respectively. What does each column contain in wide data?
  • A specific constraint
  • A specific data type
  • A unique data variable
  • A unique format

Fill in the blank: Data transformation enables data analysts to change the _____ of the data.
  • value
  • structure
  • accuracy
  • meaning

Continuous data is measured and has a limited number of values.
  • True
  • False

Which of the following values are examples of a Boolean data type?
Select all that apply.
  • True or false
  • Yes, no, or unsure
  • Yes or no
  • One, two, or three

If you have a short time frame for data collection and need an answer immediately, you likely will have to use historical data.
  • True
  • False

Which of the following is an example of continuous data?
  • Leading actors in movie
  • Box office returns
  • Movie run time
  • Movie budget

Which of the following questions collect nominal qualitative data? Select all that apply.
  • How likely are you to recommend this restaurant to a friend?
  • Is this your first time dining at this restaurant?
  • Have you heard of our frequent diner program?
  • Did anyone recommend our restaurant to you today?

Data transformation can change the structure of the data. An example of this is taking data stored in one format and converting it to another.
  • True
  • False

Which of the following is a benefit of internal data?
  • Internal data is less vulnerable to biased collection.
  • Internal data is the only data relevant to the problem.
  • Internal data is less likely to need cleaning.
  • Internal data is more reliable and easier to collect.

Week 2 – Bias, credibility, privacy, ethics, and access

Which of the following best describes data bias?
  • It is a preference in the data in favor of or against a person, group, or thing.
  • It is a measure of how closely the data represent the population.
  • It refers to how consistent the data is over time as new data is added.
  • It is the tendency for the data to remain accurate for longer.

In data ethics, consent gives an individual the right to know the answers to which of the following questions? Select all that apply.
  • How will my data be used?
  • How long will my data be stored?
  • Why am I being forced to share my data?
  • Why is my data being collected?

An individual who provides their data has the right to know and understand all of the data-processing activities and algorithms used on that data. This concept refers to which aspect of data ethics?
  • Ownership
  • Currency
  • Transaction transparency
  • Consent

A company collects and analyzes user data.
As part of this process, they preserve each data subject’s information and activity for all data transactions. What data ethics concept does this describe?
  • Consent
  • Privacy
  • Transformation
  • Transparency

Fill in the blank: A preference in favor of or against a person, group of people, or thing is called _____. It is an error in data analytics that can systematically skew results in a certain direction.
  • data collection
  • data interoperability
  • data bias
  • data anonymization

Which type of bias is the tendency to always construe ambiguous situations in a positive or negative way?
  • Observer
  • Confirmation
  • Sampling
  • Interpretation

Which of the following are qualities of unreliable data? Select all that apply.
  • Biased
  • Inaccurate
  • Vetted
  • Incomplete

Fill in the blank: Data _____ refers to well-founded standards of right and wrong that dictate how data is collected, shared, and used.
  • ethics
  • privacy
  • credibility
  • anonymization

Ownership is a key issue in data ethics. Who owns data?
  • The organization that invests time and money collecting, processing, and analyzing the data
  • The government that passes data-protection legislation
  • The individual who originally generates the data
  • The law enforcement agencies that enforce data protection laws

An employer accesses an employee’s credit report without their consent. This is not a violation of the employee’s privacy because they work at the company.
  • True
  • False

What is the process of protecting people’s private or sensitive data by eliminating identifying information?
  • Data governance
  • Data design
  • Data ethics
  • Data anonymization

A key aspect of open data is free access to people’s personal information.
  • True
  • False

Shuffle Q/A

A clinic surveys a group of male and female patients about their experience with physical therapy. The survey does not include people with disabilities. Is the survey data biased?
  • Yes
  • No

A university surveys its student-athletes about their experience in college sports. The survey only includes student-athletes with scholarships.
What type of bias is this an example of?
  • Interpretation bias
  • Observer bias
  • Confirmation bias
  • Sampling bias

An individual who provides their data has the right to know and understand all of the data-processing activities and algorithms used on that data. This is called ownership.
  • True
  • False

The right to inspect, update, or correct your own data is part of which aspect of data ethics?
  • Data openness
  • Data ownership
  • Data consent
  • Data privacy

Interoperability is key to open data’s success. Which of the following is an example of interoperability?
  • A website charges a fee to access a database
  • An analyst removes all personally identifiable information from a database
  • Different databases use common formats and terminology
  • A company restricts the use of a database to its own employees

Which of the following situations are examples of bias? Select all that apply.
  • A researcher who surveys a sample group that is representative of the population
  • A scholar who only reads sources that support their argument
  • A dancing competition judge who is a close friend of the dancer who wins the competition
  • A daycare that won’t hire men for childcare positions

Which of the following “C’s” describe qualities of good data? Select all that apply.
  • Comprehensive
  • Cited
  • Current
  • Consequential

If a company uses your personal data as part of a financial transaction, you should be made aware of the nature and scale of the transaction. What concept of data ethics does this refer to?
  • Privacy
  • Currency
  • Ownership
  • Consent

Data anonymization applies to both text and images.
  • True
  • False

The government of a large city collects data on the quality of the city’s infrastructure. Any business, nonprofit organization, or person can access the government’s databases and re-use or redistribute the data. Is this an example of open data?
  • Yes
  • No

Which of the following are types of data bias often encountered in data analytics?
Select all that apply.
  • Observer bias
  • Interpretation bias
  • Educational bias
  • Confirmation bias

In general, the usefulness of data decreases as time passes.
  • True
  • False

Ownership is a key issue in data ethics. Who owns data?
  • The law enforcement agencies that enforce data protection laws
  • The organization that invests time and money collecting, processing, and analyzing the data
  • The individual who originally generates the data
  • The government that passes data-protection legislation

Which of the following are commonly used methods for anonymizing data? Select all that apply.
  • Masking
  • Hashing
  • Deleting
  • Blanking

Week 3 – Databases: Where data lives

Which of the following properties describe primary keys in a relational database? Select all that apply.
  • They are used to ensure data in a specific column is unique.
  • They refer to another primary key in a different table.
  • There can be multiple primary keys in a table.
  • There can only be one primary key in a table.

What do metadata repositories do to make it simpler and quicker to use multiple data sources for analysis? Select all that apply.
  • Keep metadata in a common structure
  • Keep the metadata in an accessible form
  • Store the related data assets
  • Describe where data came from

Which type of metadata is used to indicate where a digital asset or piece of information originated from?
  • Structural
  • Administrative
  • Descriptive
  • General

What is the process that data analysts use to ensure the formal management of their company’s data assets?
  • Data integrity
  • Data aggregation
  • Data mapping
  • Data governance

What are some of the reasons for open data initiatives? Select all that apply.
  • To educate citizens about local issues
  • To make government activities more transparent
  • To increase protection of proprietary data
  • To give people ways to provide feedback to the government

A nonprofit has a list of their many donors. They want to send a mailing to donors who live within 100 miles of the nonprofit’s headquarters.
How could they use the column distance_to_hq to only display the donors that meet those conditions?
  • Filter out distances smaller than 50 miles.
  • Filter out distances greater than 100 miles.
  • Sort numerically in ascending order.
  • Sort numerically in descending order.

In the following piece of SQL code, what does the asterisk (*) represent?
SELECT * FROM customers
  • Include all columns.
  • Include all tables.
  • Include the first column.
  • Include specified conditions.

You are working with a database table that contains customer data. The company column lists the company affiliated with each customer. You want to find customers from the company Riotur. You write the SQL query below.
SELECT * FROM Customer
What code would be added to return only customers affiliated with the company Riotur?
  • company = 'Riotur'
  • WHERE company = 'Riotur'
  • JOIN company = 'Riotur'
  • IN company = 'Riotur'

Primary and foreign keys are two connected identifiers within separate tables. These tables exist in what kind of database?
  • Metadata
  • Primary
  • Relational
  • Normalized

When working with data from an external source, what can metadata help data analysts do? Select all that apply.
  • Ensure data is clean and reliable
  • Combine data from more than one source
  • Understand the contents of a database
  • Choose which analyses to run

Think about data as a student at a high school. In this metaphor, which of the following are examples of metadata? Select all that apply.
  • Student’s ID number
  • Student’s enrollment date
  • Classes the student is enrolled in
  • Grades the student earns

Fill in the blank: Data _____ is the process of ensuring the formal management of a company’s data assets.
  • aggregation
  • integrity
  • mapping
  • governance

In what circumstance might a data analyst choose not to use external data in their analysis?
  • The data represents diverse perspectives
  • The data is too thorough
  • The data is free for anyone to access
  • The data cannot be confirmed to be reliable

A nonprofit maintains a list of how many laptops they provide to each school in the county.
In the table, there is a column called number_of_laptops. A data analyst wants to determine which schools were given the fewest laptops. How should they sort the data to return these schools first?
  • Sort alphabetically in ascending order
  • Sort numerically in descending order
  • Sort alphabetically in descending order
  • Sort numerically in ascending order

When writing a query, it's necessary for the name of the dataset to be inside two backticks in order for the query to run properly.
  • True
  • False

You are working with a database table that contains customer data. The city column lists the city where each customer is located. You want to find out which customers are located in Berlin. You write the SQL query below. Add a WHERE clause that will return only customers located in Berlin.
How many customers are located in Berlin? 91227

Shuffle Q/A

Relational databases contain a series of tables connected to form relationships. Which two types of fields exist in two connected tables?
  • Star and snowflake schemas
  • Descriptive and structural metadata
  • Internal and external data
  • Primary and foreign keys

Data analysts use metadata for what tasks? Select all that apply.
  • To combine data from more than one source
  • To perform data analyses
  • To interpret the contents of a database
  • To evaluate the quality of data

Think about data as driving a taxi cab. In this metaphor, which of the following are examples of metadata? Select all that apply.
  • Company that owns the taxi
  • License plate number
  • Make and model of the taxi cab
  • Passengers the taxi picks up

Fill in the blank: Data governance is the process of ensuring that a company’s _____ are managed in a formal manner.
  • business tasks
  • data engineers
  • data assets
  • business strategies

What are some key benefits of using external data? Select all that apply.
  • External data is always reliable.
  • External data is free to use.
  • External data has broad reach.
  • External data can provide industry-level perspectives.

A data analyst reviews a national database of movie theater showings.
They want to find the first movies shown in San Francisco in 2001. How can they organize the data to return the first 10 movies shown at the top of their list? Select all that apply.
  • Filter out showings not in 2001
  • Sort by date in descending order
  • Sort by date in ascending order
  • Filter out showings outside of San Francisco

You are working with a database table that contains customer data. The state column lists the state where each customer is located. The state names are abbreviated. You want to find out which customers are located in the state of Florida (FL). You write the SQL query below. Add a WHERE clause that will return only customers located in FL.
How many customers are located in FL? 6413

Structural metadata indicates how a piece of data is organized and whether it’s part of one or more than one data collection.
  • True
  • False

Relational databases illustrate relationships between tables. Which fields represent the connection between these tables? Select all that apply.
  • Foreign keys
  • External keys
  • Primary keys
  • Secondary keys

When writing a query, you must remove the two backticks around the name of the dataset in order for the query to run properly.
  • True
  • False

You are working with a database table that contains customer data. The first_name column lists the first name of each customer. You are only interested in customers with the first name Mark. You write the SQL query below. Add a WHERE clause that will return only customers named Mark.
How many customers are named Mark? 1532

Metadata is data about data. What kinds of information can metadata offer about a particular dataset? Select all that apply.
  • How to combine the data with another dataset
  • Which analyses to perform on the data
  • If the data is clean and reliable
  • What kinds of data it contains

A data analyst reviews a database of Wisconsin car sales to find the last car models sold in Milwaukee in 2019. How can they sort and filter the data to return the last five cars sold at the top of their list?
Select all that apply.
  • Filter out sales outside of Milwaukee
  • Filter out sales not in 2019
  • Sort by sale date in descending order
  • Sort by sale date in ascending order

When writing a query, the name of the dataset can either be inside two backticks, or not, and the query will still run properly.
  • True
  • False

A data analyst chooses not to use external data because it represents diverse perspectives. This is an appropriate decision when working with external data.
  • True
  • False

Week 4 – Organizing and protecting your data

A data analyst has been tasked with a new project and has started to collect data from multiple sources. The analyst will be working with multiple team members on this project and needs to create a naming convention to allow project files to be located efficiently. What should the analyst include in each file's name? Select all that apply.
  • Content
  • Collaborators
  • Version number
  • Creation date

Your boss assigns you a new multi-phase project and you create a naming convention for all of your files. With this project lasting years and incorporating multiple analysts, it’s crucial that you create data explaining how your naming conventions are structured. What is this data called?
  • Named convention
  • Labeled data
  • Metadata
  • Descriptive data

A data analyst creates a file that lists people who donated to their organization’s fund drive. An effective name for the file is FundDriveDonors_20210216_V01.
  • True
  • False

You have just started a new project and have created a naming convention for all of your files. Once the data has been collected you start foldering. What does the foldering process allow you to do?
  • Organize your files into subfolders
  • Sort your files by name
  • Organize your files in the filing cabinet
  • Organize your files into the cloud

A data analyst deletes an old project’s files from their active project folder. A few months later, they have to review the work that they completed on this project but cannot find the older project files.
What should the data analyst have done?
  • Archive the project
  • Email the project
  • Print the project
  • Delete the project

As a data analyst, folder organization is key to being efficient at your job. A common practice is to lay out your folders with broad topics at the top with more specific topics at the bottom. What’s the name of this approach?
  • Bottom to top
  • Left to right
  • Heterarchy
  • Hierarchy

To reduce clutter, a data analyst hides cells that contain long, complex formulas. The hidden cells allow the data analyst to protect their formulas and hide the data from other users with access to the spreadsheet.
  • True
  • False

Fill in the blank: File-naming conventions are _____ that describe a file's content, creation date, or version.
  • general attributes
  • frequent suggestions
  • consistent guidelines
  • common verifications

A data analytics team uses data about data to indicate consistent naming conventions for a project. What type of data is involved in this scenario?
  • Metadata
  • Aggregated data
  • Long data
  • Big data

Data analysts use naming conventions to help them identify or locate a file. Which of the following is an example of an effective file name?
  • Elementary_Students_20090221_V03
  • Sept_ElemtaryStudents_V1
  • ElementarySchoolStudents_EnrollingSeptember2021_PlusRisingMiddleSchool_FJPSKVND
  • Elem_9

Data analysts use a process called encryption to organize folders into subfolders.
  • True
  • False

A data analyst completes a project. They move project files to another location to keep them separate from their current work. This is an example of what process?
  • Duplicating files
  • Destroying files
  • Archiving files
  • Renaming files

Data analysts create hierarchies to organize their folders.
How are folder hierarchies structured?
  • Broad topics at the top, then more specific topics below
  • Broad topics at the right, then more specific topics at the left
  • Specific topics at the top, then more broad topics below
  • Broad topics at the left, then more specific topics at the right

Using encryption to protect data is an example of what?
  • Data validation
  • Data integrity
  • Data ethics
  • Data security

To reduce clutter, a data analyst hides cells that contain long, complex formulas. To view the formulas again, the analyst will need to adjust the spreadsheet sharing or encryption settings.
  • True
  • False

Shuffle Q/A

A data analyst is working with a file from a customer satisfaction survey. The survey was sent to anyone who became a customer between April and June, 2020. Which of the following is an effective name for the file?
  • April_May_June_2020_Responses_to_New_Customer_Survey_ANALYSISDATA_928310
  • NewCustomerSurvey_2020-6-20_V03
  • Survey_Responses
  • Apr-June2020_CustSurvey_V

Foldering may be used by data analysts to organize folders into what?
  • Databases
  • Subfolders
  • Versions
  • Tables

Data analysts use archiving to separate current from past work. It also cuts down on clutter.
  • True
  • False

Fill in the blank: Data analysts create _____ to structure their folders.
  • scales
  • sequences
  • ladders
  • hierarchies

A data analyst wants to ensure only people on their analytics team can access, edit, and download a spreadsheet. They can use which of the following tools? Select all that apply.
  • Sharing permissions
  • Templates
  • Filtering
  • Encryption

A data analyst wants to share spreadsheet tab A with their team. They’re still working with tabs B and C, and they don’t want their team members to access them yet. Hiding tabs B and C will protect them from being accessed.
  • True
  • False

A data analytics team labels its files to indicate their content, creation date, and version number.
The team is using what data organization tool?
  • File-naming verifications
  • File-naming conventions
  • File-naming attributes
  • File-naming references

To align file naming and storage practices, it’s useful to develop metadata practices with your data analytics team.
  • True
  • False

What process do data analysts use to keep project-related files together and organize them into subfolders?
  • Foldering
  • Encrypting
  • Editing
  • Naming

A data analyst completes a project. They move project files to another location to keep them separate from their current work. This is an example of what process?
  • Renaming files
  • Archiving files
  • Destroying files
  • Duplicating files

A data analyst adds sharing permissions to limit who can edit the data contained within a file. This is an example of what?
  • Data validation
  • Data integrity
  • Data security
  • Data ethics

What aspects of a file do file-naming conventions typically describe? Select all that apply.
  • Creation date
  • Content
  • Version number
  • Collaborators

Fill in the blank: A data analytics team uses _____ to indicate consistent naming conventions for a project. This is an example of using data about data.
  • folder hierarchies
  • classifications
  • metadata
  • version control

A data analyst creates a file that lists people who donated to their organization’s fund drive. An effective name for the file is FundDriveDonors_20210216_V01.
  • True
  • False

Data analysts use archiving to separate current from past work. What does this process involve?
  • Using secure data-erase software to destroy old files
  • Reviewing current data files to confirm they’ve been cleaned
  • Moving files from completed projects to another location
  • Reorganizing and renaming current files

Data analysts create hierarchies to organize their folders. They do this by structuring folders by specific topics at the top, then more broadly below.
  • True
  • False

Course challenge

Scenario 1, questions 1-5
You’ve been working at a data analytics consulting company for the past six months.
Your team helps restaurants use their data to better understand customer preferences and identify opportunities to become more profitable.

To do this, your team analyzes customer feedback to improve restaurant performance. You use data to help restaurants make better staffing decisions and drive customer loyalty. Your analysis can even track the number of times a customer requests a new dish or ingredient in order to revise restaurant menus.

Currently, you’re working with a vegetarian sandwich restaurant called Garden. The owner wants to make food deliveries more efficient and profitable. To accomplish this goal, your team will use delivery data to better understand when orders leave Garden, when they get to the customer, and overall customer satisfaction with the orders.

Before project kickoff, you attend a discovery session with the vice president of customer experience at Garden. He shares information to help your team better understand the business and project objectives. As a follow-up, he sends you an email with datasets.

Click below to read the email:
  • C3 Scenario 1_Client Email .pdf (PDF file)

And click below to access the datasets:
  • Course 3 Final Challenge Data Sets - Customer survey data (1) (CSV file)
  • Course 3 Final Challenge Data Sets - Delivery times_distance (1) (CSV file)

Reviewing the data enables you to describe how you will use it to achieve your client’s goals. First, you notice that all of the data is first-party data, which means that it was collected from outside sources.
  • True
  • False

Scenario 1 continued
Next, you review the customer satisfaction survey data. To use the template for the customer satisfaction survey data, click the link below and select “Use Template.”
Link to template: Customer Satisfaction Survey data
OR
If you don’t have a Google account, download the CSV file directly from the attachment below.
  • CustomerSurveyData - Customer survey data (CSV file)

You notice that the data in column E is an example of Boolean data.
Why did you come to this conclusion?
  • It has each subject in multiple rows.
  • It is qualitative data with a set order or scale.
  • It is organized in a certain format, such as rows and columns.
  • It has only two possible values.

Scenario 1 continued
Now, you review the data on delivery times and the distance of customers from the restaurant. To use the template for the dataset, click the link below and select “Use Template.”
Link to template: Delivery Times/Distance
OR
If you don’t have a Google account, download the CSV file directly from the attachment below.
  • DeliveryTimes_DistanceData - Delivery times_distance (CSV file)

The data in column D is an example of nominal data.
  • True
  • False

Scenario 2 continued
Consider and respond to the following question. Select all that apply.
Our data analytics team often uses both internal and external data. Describe the difference between the two.
  • External data came from a company’s own systems. Internal data came from the organization.
  • External data is often generated from within the company. Internal data is generated outside the organization.
  • Internal data came from a company’s own systems. External data comes from outside the organization.
  • Internal data is often generated from within the company. External data is generated outside the organization.

Scenario 2 continued
For your final question, your interviewer explains that Sewati Financial Services needs its clients’ trust, and this is an important responsibility for the data analytics team. He asks you to identify which data analytics practice involves preserving a data subject’s information and activity any time a data transaction occurs.
  • Bias
  • Encryption
  • Data privacy
  • Sharing permissions

Scenario 1, questions 1-5
You’ve been working at a data analytics consulting company for the past six months. Your team helps restaurants use their data to better understand customer preferences and identify opportunities to become more profitable.
To do this, your team analyzes customer feedback to improve restaurant performance. You use data to help restaurants make better staffing decisions and drive customer loyalty. Your analysis can even track the number of times a customer requests a new dish or ingredient in order to revise restaurant menus.

Currently, you’re working with a vegetarian sandwich restaurant called Garden. The owner wants to make food deliveries more efficient and profitable. To accomplish this goal, your team will use delivery data to better understand when orders leave Garden, when they get to the customer, and overall customer satisfaction with the orders.

Before project kickoff, you attend a discovery session with the vice president of customer experience at Garden. He shares information to help your team better understand the business and project objectives. As a follow-up, he sends you an email with datasets.

Reviewing the data enables you to describe how you will use it to achieve your client’s goals. First, you notice that all of the data was collected by Garden employees using their own resources. What type of data does this describe?
  • Third-party data
  • First-party data
  • Nominal data
  • Qualitative data

Scenario 1 continued
The next thing you review is the file containing pictures of sandwich deliveries over a period of 30 days. This is unstructured data, which means what?
  • It’s objective and measures facts.
  • It’s not organized in an easily identifiable manner.
  • It’s organized in a certain format.
  • It’s collected by a group directly from its audience and then sold.

Scenario 1 continued
Next, you review the customer satisfaction survey data. To use the template for the customer satisfaction survey data, click the link below and select “Use Template.”
Link to template: Customer Satisfaction Survey data
OR
If you don’t have a Google account, download the CSV file directly from the attachment below.

The question in column E asks, “Was your order accurate?
Please respond yes or no.” The responses listed in column E are an example of Boolean data.
  • True
  • False

Scenario 1 continued
Now, you review the data on delivery times and the distance of customers from the restaurant. To use the template for the dataset, click the link below and select “Use Template.”
Link to template: Delivery Times/Distance
OR
If you don’t have a Google account, download the CSV file directly from the attachment below.

The data in column E shows the duration of deliveries from Garden to customers. What type of data is this? Select all that apply.
  • Continuous data
  • Quantitative data
  • Qualitative data
  • Discrete data

Scenario 1 continued
The next thing you review is the file containing pictures of sandwich deliveries over a period of 30 days. This is an example of structured data.
  • True
  • False

Scenario 1 continued
Now that you’re familiar with the data, you want to build trust with the team at Garden. You decide to impress them by taking the initiative to reach out to your social media followers. You explain that Garden is a new client, and you show them the pictures of Garden’s sandwich deliveries from the client file. Then, you ask them if they have any photos of sandwich deliveries that you can evaluate.
This is an example of going above and beyond expectations and a great way to build trust.
  • True
  • False

Scenario 2, questions 6-10
You’ve completed this program and are interviewing for a junior data scientist position at a company called Sewati Financial Services.

So far, you’ve successfully completed the first interview with a recruiter. They arrange your second interview with the team at Sewati Financial Services.

You arrive 15 minutes early for your interview. Soon, you are escorted into a conference room, where you meet Kai Harvey, the senior manager of strategy. After welcoming you, he begins the behavioral interview.

Consider and respond to the following question. Select all that apply.
Our data analytics team often surveys clients to get their feedback.
If you were on the team, how would you ensure the process does not cause potential bias?
  • Make sure the wording of the survey question does not encourage a specific response from participants.
  • Include clients with disabilities in the survey sample.
  • Give participants enough time to answer each survey question.
  • Instruct participants to share their name and contact information.

Scenario 2 continued
Consider and respond to the following question. Select all that apply.
Our data analytics team often uses external data. Where can you access useful external data?
  • A public database
  • An open-data website
  • Sewati Financial Services database in the cloud
  • Sewati Financial Services website

Scenario 2 continued
Consider and respond to the following question. Select all that apply.
Our analysts often work within the same spreadsheet, but for different purposes. What tools would you use in such a situation?
  • Freeze the header rows
  • Sort the data to make it easier to understand, analyze, and visualize
  • Filter to show only the data that meets a specific criteria
  • Encrypt the spreadsheet so only you can access it

Scenario 2 continued
Next, your interviewer wants to better understand your knowledge of basic SQL commands. He asks: How would you write a query that retrieves only data about people who work in Boise from the Clients table in our database?

Scenario 2 continued
For your final question, your interviewer explains that Sewati Financial Services cares about data privacy. The company needs its clients’ trust, and this is an important responsibility for the data analytics team.
He asks: What does data privacy involve? Select all that apply.
  • Encryption and sharing permissions
  • Preserving a data subject’s information and activity any time a data transaction occurs
  • Putting privacy measures in place to protect people’s data
  • A person’s legal right to their data

Shuffle Q/A

Scenario 1, questions 1-5
You’ve been working at a data analytics consulting company for the past six months.
Your team helps restaurants use their data to better understand customer preferences and identify opportunities to become more profitable.

To do this, your team analyzes customer feedback to improve restaurant performance. You use data to help restaurants make better staffing decisions and drive customer loyalty. Your analysis can even track the number of times a customer requests a new dish or ingredient in order to revise restaurant menus.

Currently, you’re working with a vegetarian sandwich restaurant called Garden. The owner wants to make food deliveries more efficient and profitable. To accomplish this goal, your team will use delivery data to better understand when orders leave Garden, when they get to the customer, and overall customer satisfaction with the orders.

Before project kickoff, you attend a discovery session with the vice president of customer experience at Garden. He shares information to help your team better understand the business and project objectives. As a follow-up, he sends you an email with datasets.

Reviewing the data enables you to describe how you will use it to achieve your client’s goals. First, you notice that all of the data is first-party data. What does this mean?
  • It’s data that was collected from outside sources.
  • It’s data that was collected by Garden employees using the company’s own resources.
  • It’s a type of data that’s categorized without a set order.
  • It’s subjective data that measures qualities and characteristics.

Scenario 1 continued
Next, you review the customer satisfaction survey data. To use the template for the customer satisfaction survey data, click the link below and select “Use Template.”
Link to template: Customer Satisfaction Survey data
OR
If you don’t have a Google account, download the CSV file directly from the attachment below.

You notice that the data in column E is an example of Boolean data.
Why did you come to this conclusion?
  • It has each subject in multiple rows.
  • It is qualitative data with a set order or scale.
  • It has only two possible values.
  • It is organized in a certain format, such as rows and columns.

Scenario 1 continued
Now, you review the data on delivery times and the distance of customers from the restaurant. To use the template for the dataset, click the link below and select “Use Template.”
Link to template: Delivery Times/Distance
OR
If you don’t have a Google account, download the CSV file directly from the attachment below.

Fill in the blank: The data in column E is an example of _____ data. Select all that apply.
  • continuous
  • qualitative
  • discrete
  • quantitative

Scenario 1 continued
The next thing you review is the file containing pictures of sandwich deliveries over a period of 30 days. What type of data is this?
  • Ordinal
  • Unstructured
  • Discrete
  • Relational

Scenario 1 continued
Now that you’re familiar with the data, you want to build trust with the team at Garden. What actions should you take when working with their data? Select all that apply.
  • Keep the data safe by implementing data-security measures, such as password protection and user permissions.
  • Share the client’s data with other delivery restaurants to compare performance.
  • Post on social media that you’re working with Garden and would like feedback from any of your contacts who have ordered there before.
  • Organize the data using effective naming conventions.

Scenario 2, questions 6-10
You’ve completed this program and are interviewing for a junior data scientist position at a company called Sewati Financial Services.

So far, you’ve successfully completed the first interview with a recruiter. They arrange your second interview with the team at Sewati Financial Services.

You arrive 15 minutes early for your interview.
Soon, you are escorted into a conference room, where you meet Kai Harvey, the senior manager of strategy. After welcoming you, he begins the behavioral interview.

Consider and respond to the following question. Select all that apply.
Our data analytics team often surveys clients to get their feedback. If you were on the team, how would you ensure the sample is representative of the population as a whole?
  • Only include participants who can answer survey questions in a timely manner.
  • Make sure the sample is chosen at random.
  • Include clients with disabilities in the survey sample.
  • Use a randomized sample of the population that includes all genders.

Scenario 2 continued
Next, your interviewer wants to better understand your knowledge of basic SQL commands. He asks: How would you write a query that retrieves only data about people who joined our firm in 2019 from the Clients table in our database?

Scenario 2 continued
For your final question, your interviewer explains that Sewati Financial Services cares about its clients’ trust, and this is an important responsibility for the data analytics team. They do this by:
  • protecting clients from unauthorized access to their private data
  • ensuring freedom from inappropriate use of client data
  • getting consent to use someone’s data
He asks: Which data analytics practice does this describe?
  • Encryption
  • Bias
  • Data privacy
  • Sharing permissions

Scenario 1 continued
Next, you review the customer satisfaction survey data. To use the template for the customer satisfaction survey data, click the link below and select “Use Template.”
Link to template: Customer Satisfaction Survey data
OR
If you don’t have a Google account, download the CSV file directly from the attachment below.

The question in column E asks, “Was your order accurate?
Please respond yes or no.” What kind of data is this?
  • Second-party data
  • Ordinal data
  • Clean data
  • Boolean data

Scenario 1 continued
The next thing you review is the file containing pictures of sandwich deliveries over a period of 30 days. This is unstructured data, which means what?
  • It’s organized in a certain format.
  • It’s not organized in an easily identifiable manner.
  • It’s objective and measures facts.
  • It’s collected by a group directly from its audience and then sold.

Scenario 2 continued
Consider and respond to the following question. Select all that apply.
Our data analytics team often uses external data. Where can you locate useful external data?
  • Other financial businesses
  • Sewati Financial Solutions marketing department
  • Government sources
  • A professional finance association

Scenario 2 continued
Consider and respond to the following question.
Our analysts often work within the same spreadsheet, but for different purposes. How could filtering help in this situation?
  • Filtering enables you to highlight the header row
  • Filtering enables you to sort the data in a meaningful order
  • Filtering simplifies a spreadsheet by only showing you the information you need
  • Filtering encrypts the spreadsheet so only you can access it

Course 4 – Process Data from Dirty to Clean

Week 1 – The importance of integrity

Fill in the blank: As a data analyst, you need to verify that your data is _____ to ensure your analysis and conclusions are accurate.
  • complete and valid
  • private and valid
  • manipulated and replicated
  • manipulated and valid

A data analyst is given a dataset for analysis. It includes data only about the total population of every country in the previous 20 years. Based on the available data, an analyst would have the full picture and be able to determine the reasons behind a certain country's population increase from 2016 to 2017.
  • True
  • False

A data analyst is given a dataset for analysis.
To use the template for this dataset, click the link below and select “Use Template.”
Link to template: June 2014 Invoices
OR
If you don’t have a Google account, download the CSV file directly from the attachment below.
  • June 2014 Invoices - Sheet1 (CSV file)

The analyst notices a limitation with the data in rows 8 and 9. What is the limitation?
  • Row 8 and row 9 show the wrong currency.
  • Row 9 needs more data.
  • Row 9 is a duplicate of row 8.
  • Row 8 is not in the correct format.

A data analyst is working on a project about the global supply chain. They have a dataset with lots of relevant data from Europe and Asia. However, they decide to generate new data that represents all continents. What type of insufficient data does this scenario describe?
  • Data from only one source
  • Data that’s outdated
  • Data that keeps updating
  • Data that’s geographically limited

In the data analysis process, how does a sample relate to a population?
  • A sample is a duplicate selection of data that is taken from the population.
  • A sample is an average of all the data that represents the population.
  • A sample is an ideal example taken from a population.
  • A sample is a part of a population that is representative of the population.

A restaurant wants to gather data about a new dish by giving out free samples and asking for feedback. Who should the restaurant give samples to?
  • Diners who spend the most money on their meal
  • All diners
  • Selecting diners at random
  • Diners who are willing to pay for the samples

Fill in the blank: If a data analyst is using data that has been _____, the data will lack integrity and the analysis will be faulty.
  • wide
  • compromised
  • public
  • clean

A financial analyst imports a dataset to their computer from a storage device. As it’s being imported, the connection is interrupted, which compromises the data. Which of the following processes caused the compromise?
  • Data analysis
  • Data gathering
  • Data manipulation
  • Data transfer

A data analyst is given a dataset for analysis.
It includes data about the total population of every country in the previous 20 years. Based on the available data, an analyst would be able to determine the reasons behind a certain country's population increase from 2016 to 2017.
  • True
  • False

A data analyst is given a dataset for analysis. To use the template for this dataset, click the link below and select “Use Template.” Link to template: June 2014 Invoices. Or, if you don’t have a Google account, download the CSV file directly from the attachment below.
Which of the following has duplicate data?
  • Data for Symteco on 2/21/2014
  • Data for Symteco on 5/20/2014
  • Data for Valando on 2/18/2014
  • Data for Valando on 1/1/2014

A data analyst at a nonprofit organization is working with a dataset about a summer fundraiser. Although they have a lot of useful data by the end of the month, they recognize that the data is insufficient. So, they decide to wait until the end of the season to begin working with the dataset. Which type of insufficient data does this example describe?
  • Outdated data
  • Data from only one source
  • Geographically limited data
  • Data that keeps updating

When gathering data through a survey, companies can save money by surveying 100% of a population.
  • True
  • False

Fill in the blank: Sampling bias in data collection happens when a sample isn’t representative of _____.
  • the population as a whole
  • a dataset about the population
  • a subset of the population
  • the population most affected by the data

Data and business objectives might not align for a number of reasons. Which of the following issues can prevent alignment? Select all that apply.
  • Sampling bias
  • Data integrity
  • Data visualization
  • Insufficient data

Shuffle Q/A

Which of the following conditions are necessary to ensure data integrity?
Select all that apply.
  • Privacy
  • Completeness
  • Statistical power
  • Accuracy

What is one potential problem associated with data manipulation that analysts must be aware of?
  • Data manipulation can separate a dataset among different locations.
  • Data manipulation can help organize a dataset.
  • Data manipulation can introduce errors.
  • Data manipulation can make a dataset easier to read.

As a data analyst, you are working for a national pizza restaurant chain. You have a dataset with monthly order totals for each branch over the past year. With only this data, what questions can you answer?
  • Which region had the highest sales over the last two years?
  • Which branch will be the most profitable over the next year?
  • What was the most popular item on the menu?
  • Which branch had the most orders in the last month of last year?

A data analyst is given a dataset for analysis. To use the template for this dataset, click the link below and select “Use Template.” Link to template: June 2014 Invoices. Or, if you don’t have a Google account, download the CSV file directly from the attachment below: June 2014 Invoices - Sheet1.
The data analyst is asked to find the average estimate for Symteco over the past three years. What limitation of the data makes this impossible?
  • The data uses the wrong currency.
  • The data is all from a single year.
  • The data does not include Symteco.
  • The data does not include estimates.

A data analyst at a software company wants to learn more about industry competitors. Because the software industry has more mergers than any other field, the companies and their products are constantly evolving. The analyst has a dataset from three years ago, and they notice that many of the companies and products in the dataset have changed. What makes the analyst decide that the data is insufficient, so they should generate fresh data instead?
  • It is outdated data.
  • It is geographically limited data.
  • It is data that keeps updating.
  • It is data from only one source.
A restaurant gathers data about a new dish by providing free samples to parties of six or more diners. What does this scenario describe?
  • Random sampling
  • Unbiased sampling
  • Geographically limited sampling
  • Sampling bias

Which of the following processes helps ensure a close alignment of data and business objectives?
  • Completing data replication
  • Transferring data multiple times
  • Maintaining data integrity
  • Having data update automatically during analysis

What can jeopardize data integrity throughout its lifecycle? Select all that apply.
  • Insufficient data
  • Human error
  • Malware
  • System failures

A healthcare company keeps copies of their data at several locations across the country. The data becomes compromised because each location creates a copy of the original at different times of day. Which of the following processes caused the compromise?
  • Data gathering
  • Data manipulation
  • Data transfer
  • Data replication

A data analyst is given a dataset for analysis. It includes data about the total population of every country in the previous 20 years. Which of the following questions would the analyst need more data to address?
  • Which country had the smallest population in 2017?
  • Which country had the greatest population in 2015?
  • What was the reason for the population increase in a certain country?
  • What was the population of a certain country in 2020?

A data analyst is given a dataset for analysis. To use the template for this dataset, click the link below and select “Use Template.” Link to template: June 2014 Invoices. Or, if you don’t have a Google account, download the CSV file directly from the attachment below.
June 2014 Invoices - Sheet1
Which of the following are limitations of this dataset?
  • Identifying the most profitable clients between January and November of 2014
  • Identifying the least profitable clients between January and November of 2014
  • Identifying the worst paying client between March and December of 2014
  • Identifying the best paying client between January and November of 2014

A car manufacturer wants to learn more about the brand preferences of electric car owners. There are millions of electric car owners in the world. Who should the company survey?
  • A sample of all electric car owners
  • The entire population of electric car owners
  • A sample of car owners who have owned more than one electric car
  • A sample of car owners who most recently bought an electric car

A candy manufacturer finds an even distribution of sales across all age ranges of customers who purchase their products. The manufacturer decides to conduct a survey to learn more about its customer base. Due to age requirements, they can only send the survey to customers who are 21 years or older. This scenario can be described as what?
  • Down sampling bias
  • Sampling bias
  • Unbiased sampling
  • Upsampling bias

What best describes a sample size?
  • A subset of the population between the 25th and 50th percentile
  • A random subset of the population
  • A subset that is representative of the population as a whole
  • A subset of the population excluding outliers

Fill in the blank: In order to have a strong and thorough analysis, a data analyst must verify _____.
  • data replication
  • data manipulation
  • data engineering
  • data integrity

Fill in the blank: _____ is the process of changing data to make it more organized and easier to read.
  • Data transfer
  • Data manipulation
  • Data gathering
  • Data replication

You are working for a global technology company. You have a dataset with the company’s total cell phone sales by country from 2015 to present.
Based on the data you have, what questions are you able to answer?
  • What was the effect on sales when a new phone model was launched?
  • What was the effect on sales when new phone features were introduced?
  • What countries have the most cell phone sales in the past three years?
  • What are the mean cell phone sales for each country since 2010?

A data analyst, working for a publishing company, gathers a dataset which includes all books sold in the United Kingdom over the last three years. However, they decide to generate new data that represents global book sales. What type of insufficient data does this scenario describe?
  • Data that keeps updating
  • Data that is outdated
  • Data that is geographically limited
  • Data from only one source

A company is trying to learn more about their customer base. They would like to conduct a survey to understand why their customers chose their brand. How should the company survey its customers?
  • Conduct a survey of customers who purchased a different brand
  • Conduct a survey of customers that live in high-income areas
  • Conduct a survey with a representative sample of their customer population
  • Conduct a survey with customers who have purchased more than five products

Sometimes during analysis, an analyst discovers that it’s necessary to adjust the business objective. When this happens, the analyst should take the initiative to do so without involving others in order to be respectful of their time.
  • True
  • False

A car dealership gathers data about their entire customer population. They decide to conduct a survey to understand why their customers chose their dealership. They send out an email to all customers who have purchased more than two vehicles in the past five years. What does this scenario describe?
  • Unbiased sampling
  • Geographically limited sampling
  • Random sampling
  • Sampling bias

A data analyst needs to migrate data from a server located at their company's headquarters to a remote site.
This can lead to what type of data integrity issue?
  • Data replication
  • Data cleaning
  • Data transfer
  • Data manipulation

As a data analyst, you work with data about the life expectancy of sea turtles in the Coral Triangle. The dataset contains an estimated birthdate and deathdate for all tracked sea turtles. With the data you have, what questions are you able to answer?
  • What is the median age a sea turtle has lived in the Coral Triangle?
  • Where is the most prevalent location sea turtles are being hatched in the Coral Triangle?
  • What is the largest sea turtle ever recorded?
  • Is the sea turtle population increasing throughout the world?

A clothing manufacturer wants to learn more about why their consumers have purchased the brand’s products. How should this manufacturer conduct their survey?
  • Send the survey to a representative sample of their customers
  • Send the survey to customers who have purchased more than one product
  • Send the survey to their least frequent customers
  • Send the survey to random people who buy clothes

Week 2 – Sparkling-clean data

Fill in the blank: Conditional formatting is a spreadsheet tool that changes how _____ appear when values meet a specific condition.
  • charts
  • filters
  • cells
  • queries

For a function to work properly, data analysts must follow each function’s predetermined structure. What is this structure called?
  • Validation
  • Syntax
  • Algorithm
  • Summary

An analyst is cleaning a new dataset. They want to make sure the data contained from cell B2 through cell B100 does not contain a number smaller than 10.
Which COUNTIF function syntax can be used to answer this question?
  • =COUNTIF(B2:B100,"<9")
  • =COUNTIF(B2:B100,">=10")
  • =COUNTIF(B2:B100,>50)
  • =COUNTIF(B2:B200,"<=50")

VLOOKUP searches for a value in a row in order to return a corresponding piece of information.
  • True
  • False

To evaluate how well two or more data sources work together, data analysts use data mapping.
  • True
  • False

As part of the data-cleaning process, a data analyst creates a rule to highlight any empty cells in a bright blue color. This is an example of data visualization.
  • True
  • False

A data analyst at a nonprofit organization is working with the following spreadsheet, which contains member name data in column C. They want to divide this data using the underscore as a delimiter, so that first names are stored in one column and last names in another. Which tool should the analyst use?
  • Conditional formatting
  • Pivot table
  • SPLIT function
  • MID function

Fill in the blank: When describing a SUM function, the _____ is =SUM(value 1 through value 2).
  • syntax
  • standard
  • structure
  • script

You are working with the following selection of a spreadsheet: In order to extract the five-digit postal code from Burlington, MA, what is the correct function?
  • =RIGHT(B3,5)
  • =RIGHT(5,B3)
  • =LEFT(5,B3)
  • =LEFT(B3,5)

A data analyst in a human resources department is working with the following selection of a spreadsheet: They want to create employee identification numbers (IDs) in column D. The IDs should include the year hired plus the last four digits of the employee’s Social Security Number (SS#). What function will create the ID 20093208 for the employee in row 5?
  • =CONCATENATE(A5!B5)
  • =CONCATENATE(A5*B5)
  • =CONCATENATE(A5+B5)
  • =CONCATENATE(A5,B5)

A data analyst at an e-commerce company is working with a spreadsheet containing last month's sales. The most expensive product their company sells costs $49.99, so they want to quickly confirm that all of the data in the Sales column is $49.99 or less.
What function can they use?
  • SUMIF
  • COUNTIF
  • COUNT
  • SUM

A data analyst wants to search for a certain value in a column, then return a corresponding piece of information. Which function should they use?
  • VALUE
  • VLOOKUP
  • MATCH
  • FIND

A data analyst needs to combine two datasets. Each dataset comes from a different system, and the systems store data in different ways. What can the data analyst do to ensure the data is compatible?
  • Use a data visualization
  • Map the data
  • Apply a data structure
  • Merge the data

Shuffle Q/A

In their spreadsheet, a data analyst makes cells stand out for more efficient analysis. What spreadsheet tool is used to do this?
  • Cell filtering
  • Conditional ranking
  • Conditional formatting
  • Cell querying

A data analyst uses the SPLIT function to divide a text string around a specified character and put each fragment into a new, separate cell. What is the specified character separating each item called?
  • Unit
  • Delimiter
  • Partition
  • Substring

A data analyst is using a function in a spreadsheet. For the function to work correctly, they follow the function’s syntax. What does this entail?
  • It is the function’s name and placement.
  • It is how the function can be used in a program.
  • It is the function’s required information and its proper placement.
  • It is the purpose of the function and its use.

In a spreadsheet, what is the correct function for extracting the first two characters of the string located in cell A7?
  • =LEFT(A7,2)
  • =LEFT(2,A7)
  • =RIGHT(A7,2)
  • =RIGHT(2,A7)

Fill in the blank: In a spreadsheet, the function VLOOKUP is used to _____ information in a column based on a specified data value.
  • return
  • replace
  • transform
  • delete

What describes syntax?
  • It is the function’s required information and its proper placement.
  • It is how the function can be used in a program.
  • It is the purpose of the function and its use.
  • It is the function’s name and placement.
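
The spreadsheet functions these Week 2 questions keep returning to (LEFT, RIGHT, SPLIT, CONCATENATE, COUNTIF, VLOOKUP) can be illustrated outside a spreadsheet. The following is a minimal Python sketch of their behavior, not how any spreadsheet program implements them. The helper names and all sample values are hypothetical, chosen only to mirror the argument order the questions test (the text or range first, then the count or criterion).

```python
def left(text, n):
    """Like =LEFT(text, n): the first n characters of text."""
    return text[:n]

def right(text, n):
    """Like =RIGHT(text, n): the last n characters of text."""
    return text[-n:] if n > 0 else ""

def split(text, delimiter):
    """Like =SPLIT(text, delimiter): one fragment per resulting cell."""
    return text.split(delimiter)

def concatenate(*values):
    """Like =CONCATENATE(a, b, ...): joins values as text, no arithmetic."""
    return "".join(str(v) for v in values)

def countif(values, criterion):
    """Like =COUNTIF(range, criterion): how many values meet the criterion."""
    return sum(1 for v in values if criterion(v))

def vlookup(key, rows, col_index):
    """Exact-match VLOOKUP: find key in the first column of rows and
    return the value in column col_index (1-based) of the same row."""
    for row in rows:
        if row[0] == key:
            return row[col_index - 1]
    return None

# Hypothetical sample values, for illustration only.
address = "Burlington, MA 01803"
print(right(address, 5))                       # last five characters
print(left("Sparkling", 2))                    # first two characters
print(split("lastname_firstname", "_"))        # underscore as the delimiter
print(concatenate(2009, 3208))                 # year hired + last four digits
print(countif([12, 9, 30, 10], lambda v: v < 10))  # values smaller than 10
print(vlookup("B-2", [("A-1", "pens"), ("B-2", "ink")], 2))
```

Note that `concatenate(2009, 3208)` yields the text `20093208`, whereas adding or multiplying the two numbers would not, which is why the comma-separated form is the one that builds an ID.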
A data analyst in a human resources department is working with the following selection of a spreadsheet: They want to create employee identification numbers (IDs) in column D. The IDs should include the last four digits of the employee’s Social Security Number (SS#) plus the year hired. What function will create the ID 19392020 for the employee in row 4?
  • =CONCATENATE(B4+A4)
  • =CONCATENATE(B4,A4)
  • =CONCATENATE(A4+B4)
  • =CONCATENATE(A4!B4)

An analyst is cleaning a new dataset. They want to determine how many of the cells in column F have a value of 0. However, they only want rows 7 to 120 to be considered. Which COUNTIF function syntax can be used to answer this question?
  • =COUNTIF(F2:F1250, 0)
  • =COUNTIF(F7:F120, =0)
  • =COUNTIF(F7:F120,"0")
  • =COUNTIF(F7:F120,"=0")

A data analyst needs to combine two datasets. Each dataset comes from a different system, and the systems store data in different ways. What can the data analyst do to ensure the data is compatible prior to analyzing the data?
  • Use a data visualization
  • Map the data
  • Spot check for null values
  • Apply a data structure

A data analyst is working on a spreadsheet in which one of the columns contains name data. This data is formatted as lastname_firstname. The analyst splits this data at the underscore so that each piece (firstname and lastname) is contained in its own column. In this context, what is the underscore acting as?
  • Partition
  • Delimiter
  • Substring
  • MID function

A data analyst is using a function in a spreadsheet. When they input the function, they follow a predetermined structure that includes all required information for the function and its proper placement.
What aspect of a function does this describe?
  • The specified value of the function
  • The syntax of the function
  • The length of the function
  • The number of characters in the function

You are working with the following selection of a spreadsheet: In order to extract the five-digit postal code from Brandon, FL, what is the correct function?
  • =RIGHT(5,B4)
  • =RIGHT(B4,5)
  • =LEFT(B4,5)
  • =LEFT(5,B4)

A data analyst in a human resources department is working with the following selection of a spreadsheet: They want to create employee identification numbers (IDs) in column D. The IDs should include the last four digits of the employee’s Social Security Number (SS#) plus the year hired. What function will create the ID 32082009 for the employee in row 5?
  • =CONCATENATE(B5,A5)
  • =CONCATENATE(A5!B5)
  • =CONCATENATE(A5+B5)
  • =CONCATENATE(B5+A5)

Before analyzing a dataset, an analyst maps the data. What is the reason for doing this?
  • The analyst wants to know what attributes the data has.
  • The analyst thinks the dataset might have some null values.
  • The dataset has no visualizations.
  • The dataset contains data from different sources.

A data analyst suspects that there are many blank cells in their spreadsheet corresponding to missing information. What spreadsheet tool can they use to identify only those cells containing the null values?
  • Conditional ranking
  • Conditional formatting
  • Cell querying
  • Cell filtering

A data analyst is working on a spreadsheet in which one of the columns is name data. This data is formatted as lastname, firstname. The analyst chooses to divide this data into two new columns, one containing the firstname data and the other containing the lastname data.
What spreadsheet tool would they use to do this?
  • The MID function
  • The SPLIT function
  • Substring formatting
  • Conditional formatting

Fill in the blank: The function _____ is used to return information in a column that contains a specified value.
  • VALUE
  • MATCH
  • VLOOKUP
  • FIND

In a spreadsheet, what function would you use to extract the last three characters of the string located in row 4, column C?
  • =RIGHT(3,C4)
  • =LEFT(C4,3)
  • =LEFT(3,C4)
  • =RIGHT(C4,3)

Week 3 – Cleaning data with SQL

In which of the following situations would a data analyst use spreadsheets instead of SQL? Select all that apply.
  • When working with a small dataset
  • When visually inspecting data
  • When using a language to interact with multiple database programs
  • When working with a dataset with more than 1,000,000 rows

In SQL databases, what data type is the value 78.99 an example of?
  • Integer
  • String
  • Boolean
  • Float

Fill in the blank: Data analysts usually use _____ to deal with very large datasets.
  • web browsers
  • spreadsheets
  • SQL
  • word processors

What are some of the benefits of using SQL for analysis? Select all that apply.
  • SQL interacts with database programs.
  • SQL tracks changes across a team.
  • SQL has built-in functionalities.
  • SQL can pull information from different database sources.

A data analyst creates many new tables in their company’s database. When the project is complete, the analyst wants to remove the tables so they don’t clutter the database. What SQL commands can they use to delete the tables?
  • CREATE TABLE IF NOT EXISTS
  • DROP TABLE IF EXISTS
  • UPDATE
  • INSERT INTO

You are working with a database table that contains invoice data. The table includes columns for invoice_id and customer_id. You want to remove duplicate entries for customer ID and sort the results by invoice ID. You write the SQL query below. Add a DISTINCT clause that will remove duplicate entries from the customer_id column. NOTE: The three dots (...)
indicate where to add the clause.
What customer ID number appears in row 12 of your query result?
  • 2342168

You are working with a database table that contains customer data. The table includes columns about customer location such as city, state, country, and postal_code. You want to check for postal codes that are greater than 7 characters long. You write the SQL query below. Add a LENGTH function that will return any postal codes that are greater than 7 characters long.
What is the last name of the customer that appears in row 10 of your query result?
  • Rocha
  • Brooks
  • Hughes
  • Ramos

A data analyst is cleaning transportation data for a ride-share company. The analyst converts the data on ride duration from text strings to floats. What does this scenario describe?
  • Visualizing
  • Processing
  • Calculating
  • Typecasting

The CAST function can be used to convert the DATE datatype to the DATETIME datatype.
  • True
  • False

Fill in the blank: The _____ function can be used to return non-null values in a list.
  • TRIM
  • COALESCE
  • CAST
  • CONCAT

You are working with a database table that contains employee data. The table includes columns about employee location such as city, state, country, and postal_code. You want to retrieve the first 3 characters of each postal code. You decide to use the SUBSTR function to retrieve the first 3 characters of each postal code, and use the AS command to store the result in a new column called new_postal_code. You write the SQL query below. Add a statement to your SQL query that will retrieve the first 3 characters of each postal code and store the result in a new column as new_postal_code. NOTE: The three dots (...) indicate where to add the statement.
What employee ID number appears in row 5 of your query result? NOTE: The query index starts at 1 not 0.
  • 3187

Shuffle Q/A

Why do data analysts choose to work with SQL?
Select all that apply.
  • SQL can handle huge amounts of data.
  • SQL is a powerful software program.
  • SQL is a well-known standard in the professional community.
  • SQL is a programming language that can also create web apps.

A team of data analysts is working on a large project that will take months to complete and contains a huge amount of data. They need to document their process and communicate with multiple databases. The team decides to use a SQL server as the main analysis tool for this project and SQL for the queries. What makes this the most efficient tool? Select all that apply.
  • SQL efficiently handles large amounts of data.
  • SQL records queries and changes throughout a project.
  • SQL contains commands that build visualizations.
  • SQL allows you to connect to multiple databases.

Fill in the blank: _____ refers to the process of converting data from one type to another.
  • Formatting
  • Cleaning
  • Typecasting
  • Querying

A data analyst is working with product sales data. They import new data into a database. The database recognizes the data for product price as text strings. What SQL function can the analyst use to convert text strings to floats?
  • LENGTH
  • TRIM
  • SUBSTR
  • CAST

Fill in the blank: The _____ function can be used to join strings to create a new column.
  • CAST
  • COALESCE
  • TRIM
  • CONCAT

As a data analyst, you are working on a quick project containing a small amount of data. As the data was emailed to you, there is no need to query the data. What tool should you use to perform your analysis?
  • Spreadsheet
  • SQL
  • Word processor
  • CSV

A data analyst has added a massive table to their database by accident and needs to remove the table. What command can the analyst use to correct their mistake?
  • DROP TABLE IF NOT EXISTS
  • INSERT INTO
  • REMOVE TABLE IF EXISTS
  • DROP TABLE IF EXISTS

You are working with a database table that contains invoice data. The table includes a column for customer_id. You want to remove duplicate entries for customer_id and get a count of total customers in the database. You write the SQL query below.
Add a DISTINCT clause that will remove duplicate entries from the customer_id column. NOTE: The three dots (...) indicate where to add the clause.
What is the total number of customers in the database?
  • 841054359

In SQL databases, what data type refers to a number that does not contain a decimal?
  • String
  • Integer
  • Boolean
  • Float

After joining multiple tables you find your data contains a significant amount of null values. What function can you use to return only the non-null values in a list?
  • CAST
  • COALESCE
  • TRIM
  • CONCAT

You are working with a database table that contains customer data. The table includes columns about customer location such as city, state, and country. The state names are abbreviated. You want to retrieve the first 2 letters of each state name. You decide to use the SUBSTR function to retrieve the first 2 letters of each state name, and use the AS command to store the result in a new column called new_state. You write the SQL query below. Add a statement to your SQL query that will retrieve the first 2 letters of each state name and store the result in a new column as new_state. NOTE: The three dots (...) indicate where to add the statement. NOTE: SUBSTR takes three arguments: column, starting_index, and length (the number of characters to return).
What customer ID number is in row 9 of your query result? NOTE: The query index starts at 1 not 0.

A junior data analyst joins a new company. The analyst learns that SQL is heavily utilized within the organization. Why would the organization choose to invest in SQL? Select all that apply.
  • SQL is a programming language that can also create web apps.
  • SQL can handle huge amounts of data.
  • SQL is a powerful software program.
  • SQL is a well-known standard in the professional community.

You are working with a database table that contains invoice data. The table includes columns for invoice_id and billing_state. You want to remove duplicate entries for billing state and sort the results by invoice ID. You write the SQL query below.
Add a DISTINCT clause that will remove duplicate entries from the billing_state column. NOTE: The three dots (...) indicate where to add the clause.
What billing state appears in row 17 of your query result? NOTE: The query index starts at 1 not 0.
  • AZ
  • NV
  • CA
  • WI

You are working with a database table that contains customer data. The table includes columns about customer location such as city, state, country, and postal_code. You want to check for city names that are greater than 9 characters long. You write the SQL query below. Add a LENGTH function that will return any city names that are greater than 9 characters long.
What is the first name of the customer that is in row 7 of your query result? NOTE: The query index starts at 1 not 0.
  • Diego
  • Kara
  • Julia
  • Roberto

In SQL databases, what data type refers to a number that contains a decimal?
  • Boolean
  • Float
  • Integer
  • String

You’re working with a dataset that contains a float column with a significant number of decimal places. This level of granularity is not needed for your current analysis. How can you convert the data in the float column to be integer data?
  • CAST
  • COALESCE
  • TRIM
  • CONCAT

What SQL function lets you add strings together to create new text strings that can be used as unique keys?
  • CAST
  • COALESCE
  • TRIM
  • CONCAT

What are some of the benefits of using SQL for analysis? Select all that apply.
  • SQL interacts with database programs.
  • SQL has better user management than spreadsheets.
  • SQL can pull information from different database sources.
  • SQL tracks changes across a team.

A data analyst is managing a database of customer information for a retail store. What SQL command can the analyst use to add a new customer to the database?
  • UPDATE
  • CREATE TABLE IF NOT EXISTS
  • DROP TABLE IF EXISTS
  • INSERT INTO

In SQL databases, True/False values refer to what data type?
  • String
  • Float
  • Integer
  • Boolean

A data analyst is tasked with identifying what orders are still in transit. The current list of orders contains trillions of rows.
What is the best tool for the analyst to use?
  • Spreadsheets
  • CSV
  • SQL
  • Word processor

Your manager tasks you with analyzing a dataset and visually inspecting the data. Upon initial inspection you realize that this is a small dataset. What tool should you use to analyze the data?
  • CSV
  • Spreadsheet
  • SQL
  • Word processor

A data analyst creates a database to store information on the company's customer data. When completing the initial import the analyst notices that they forgot to add a few customers into the table. What command can the analyst use to add these missed customers?
  • ADD
  • APPEND
  • INSERT INTO
  • DROP

You are working with a database table that contains customer data. The table includes columns about customer location such as city, state, country, and postal_code. You want to find what state names are greater than 3 characters. You write the SQL query below. Add a LENGTH function that will return any state names that are greater than 3 characters long.
What state is in row 1 of your query result? NOTE: The query index starts at 1 not 0.
  • India
  • Chile
  • Dublin
  • Ireland

In SQL databases, what function can be used to convert data from one datatype to another?
  • CAST
  • LENGTH
  • TRIM
  • SUBSTR

After a company merger, a data analyst receives a dataset with billions of rows of data. They need to leverage this data to identify insights for upper management. What tool would be most efficient for the analyst to use?
  • Spreadsheet
  • Word processor
  • SQL
  • CSV

You are working with a database table that contains customer data. The table includes columns about customer location such as city, state, country, and postal_code. The state names are abbreviated. You want to check for state names that are greater than 2 characters long. You write the SQL query below. Add a LENGTH function that will return any state names that are greater than 2 characters long.
What country is in row 1 of your query result? NOTE: The query index starts at 1 not 0.
  • Ireland
  • India
  • France
  • Chile

You are working with a database table that contains employee data.
The table includes columns about employee location such as city, state, country, and postal_code. You use the SUBSTR function to retrieve the first 3 characters of each last_name, and use the AS command to store the result in a new column called new_last_name. You write the SQL query below. Add a statement to your SQL query that will retrieve the first 3 characters of each last_name and store the result in a new column as new_last_name. NOTE: The three dots (...) indicate where to add the statement. NOTE: SUBSTR takes three arguments: column, starting_index, and length (the number of characters to return).
What employee ID number is in row 8 of your query result? NOTE: The query index starts at 1 not 0.
  • 7318

Week 4 – Verify and report on your cleaning results

What is involved in seeing the big picture when verifying data cleaning? Select all that apply.
  • Consider the reporting
  • Consider the data
  • Consider the business problem
  • Consider the goal

Fill in the blank: A data analyst uses the CASE statement to consider one or more _____, then return a value.
  • identifications
  • conditions
  • changes
  • fields

A data analyst uses a changelog to record how the data evolves while cleaning their data.
What data cleaning best practice does this describe?
  • Examination
  • Disclosure
  • Illumination
  • Documentation

Verification and reporting come directly before the data-cleaning process.
  • True
  • False

What is the first step in the verification process?
  • Compare cleaned data with the original, uncleaned dataset and compare it to what is there now
  • Create a chronological list of modifications made to the data
  • Determine the quality of the data
  • Inform others of your data-cleaning effort

Which of the following functions automatically remove extra spaces when cleaning data?
  • SNIP
  • REMOVE
  • TRIM
  • CLEAR

What tool can a data analyst use to figure out how many identical errors occur in a dataset?
  • CASE
  • COUNTA
  • CONFIRM
  • COUNT

Fill in the blank: A data analyst uses the CASE statement to consider one or more _____, then returns a value.
  • additions
  • conditions
  • identifications
  • changes

What is the process of tracking changes, additions, deletions, and errors during data cleaning?
  • Recording
  • Observation
  • Cataloging
  • Documentation

Fill in the blank: While cleaning data, a data analyst can use a changelog to keep a chronological list of changes they make. They can refer to it during the _____ period if there are errors or questions.
  • presenting
  • verification
  • documentation
  • visualization

Reviewing version history is an effective way to view a changelog in SQL.
  • True
  • False

Shuffle Q/A

In what step of the data-cleaning process do you find mistakes before you begin analyzing the data?
  • Confirming
  • Publishing
  • Verifying
  • Processing

During the data cleaning process you find a significant amount of data that contains irrelevant spaces. Which function do you use to remove leading, trailing, or repeated spaces?
  • CUT
  • DELETE
  • TRIM
  • TIDY

A data analyst is checking for errors in a dataset. They want to determine how many times the name of a country is in the dataset using a pivot table. What function can they use to find this count?
  • COUNTA
  • CHECK
  • COUNT
  • CASE

You’re writing the below SQL query and need to change “World Wide Web” to “www”.
What function would you use to accomplish this task?

    SELECT
    _____
    WHEN ‘World Wide Web’ THEN ‘www’
    END AS some_column
    FROM
    some_table

  • THEN
  • CASE
  • ELSE
  • WHEN

What should a data analyst actively track throughout the data cleaning process?
  • Additions, changes, and queries
  • Errors, deletions, and notes
  • Changes, resolutions, and deletions
  • Errors, additions, and deletions

A data analyst is in the verification process and needs to verify the modifications that they have made to the data. What could the analyst reference to find the changes they made throughout data cleaning?
  • Changelog
  • Notepad
  • Spreadsheet
  • Metadata

A data analyst commits a query to the repository as a new and improved query. Then, they specify the changes they made and why they made them. This scenario is part of what process?
  • Reporting data
  • Visualizing data
  • Communicating with stakeholders
  • Creating a changelog

The data collected for an analysis project has just been cleaned. What are the next steps for a data analyst? Select all that apply.
  • Reporting
  • Certification
  • Validation
  • Verification

As a data analyst, you will need to keep the big picture in mind throughout any project when verifying data cleaning. What must the analyst do to take a big picture view of the project? Select all that apply.
  • Consider the data
  • Consider the goal
  • Consider the business problem
  • Consider the reporting

During the verification process, you find that you missed a few leading spaces during data cleaning. What function can you use to eliminate these spaces?
  • TRIM
  • TIDY
  • CUT
  • CROP

Which SQL tool considers one or more conditions, then returns a value as soon as a condition is met?
  • THEN
  • WHEN
  • CASE
  • ELSE

Fill in the blank: Documentation is the process of tracking _____ during data cleaning. Select all that apply.
  • additions
  • deletions
  • changes
  • inactivity

Fill in the blank: A changelog contains a _____ list of modifications made to a project.
  • random
  • approximate
  • chronological
  • synchronized

You start a complex project that will take more than a year to complete.
You need to document modifications made to your queries throughout the project. What is the correct way to store these modifications?
  • Creating a changelog
  • Creating a notepad
  • Visualizing data
  • Creating a spreadsheet

Fill in the blank: A process to confirm that a data-cleaning effort was well-executed and the resulting data is accurate and reliable is known as _____.
  • verification
  • publishing
  • manipulation
  • processing

A data analyst is in the verification step. They consider the business problem, the goal, and the data involved in their analytics project. What scenario does this describe?
  • Reporting on the data
  • Considering the stakeholders
  • Seeing the big picture
  • Visualizing the data

During data cleaning, you find an error in a username where the ID number was accidentally joined to the user’s last name. You need to figure out if this username has been entered incorrectly more than once in your dataset. If you use a pivot table, what function can you use to determine the number of times this error occurs in your dataset?
  • CASE
  • COUNT
  • COUNTA
  • CHECK

You’re working with a dataset that contains categorical variables. You notice that some of the strings are misspelled or are not capitalized. What function can you use to fix these errors when a condition is met?
  • ELSE
  • CASE
  • WHEN
  • THEN

A data analyst uses a changelog while cleaning data. What process does a changelog support?
  • Illumination
  • Examination
  • Disclosure
  • Documentation

A changelog is essential for storing chronological modifications made during the data cleaning process. When will an analyst refer to the information in the changelog to certify data integrity?
  • Documentation
  • Verification
  • Presenting
  • Visualization

Fill in the blank: As a data analyst, you should always create a _____ to track your additions, deletions, errors, and changes to a query.
  • notepad
  • database
  • changelog
  • spreadsheet

Fill in the blank: TRIM is a function that removes _____ spaces in data.
Select all that apply.
  • repeated
  • trailing
  • leading
  • inner

While verifying cleaned data, a data analyst encounters a misspelled name. Which function can they use to determine the number of misspelled occurrences in the dataset?
  • CASE
  • CHECK
  • CHECK
  • COUNTA

At what point during the analysis process does a data analyst use a changelog?
  • While cleaning the data
  • While visualizing the data
  • While gathering the data
  • While reporting the data

Your manager points out an error in a product ID number in your dataset. The Product IDs can be numbers like 42 or text like "CAD-425". Using a pivot table, what function can you use to find how many times this error occurs in the dataset?
  • COUNT
  • CHECK
  • COUNTA
  • CASE

While reviewing your coworker’s data cleaning process, you find a few cases of trailing spaces in the data. What function can you use to remove these spaces?
  • REMOVE TRAILING
  • DELETE
  • CUT
  • TRIM

Which of the following queries considers one or more conditions and returns a value as soon as that condition is met?
  • SELECT * WHEN CASE COLUMN = VARIABLE
  • SELECT * CASE IF COLUMN = VARIABLE
  • SELECT * CASE WHEN COLUMN = VARIABLE
  • SELECT * IF CASE COLUMN = VARIABLE

Course challenge

Scenario 2, questions 6-10

You’ve completed this program and are interviewing for a junior data scientist position. The job is at B.Spoke Market Research, a company that analyzes market conditions using customer surveys and other research methods. The detailed job description can be found below: C4 B.Spoke Market Research Job Description.pdf

So far, you’ve had a phone interview with a recruiter and you’ve secured a second interview with the B.Spoke team. The recruiter’s email can be found below: C4 S2 Email from Recruiter.pdf

You arrive 15 minutes early for your interview. Soon, you are escorted into a conference room, where you meet Jodie Choi, the data science lead. After welcoming you, the behavioral interview begins.

For your first question, your interviewer wants to learn about your experience with spreadsheets.
She says: Sometimes the team needs data that is stored in different spreadsheets. So, we use a spreadsheet function to find the information we need.

There is a spreadsheet function that allows a data analyst to search for a value in the first column of a given range and return the value of a specified cell in the row in which it is found. What function allows you to complete these tasks?
  • COUNTIF
  • SEARCH
  • RETURN
  • VLOOKUP

Scenario 1, questions 1-5

You are a data analyst at a small analytics company. Your company is hosting a project kick-off meeting with a new client, Meer-Kitty Interior Design. The agenda includes reviewing their goals for the year, answering any questions, and discussing their available data. Before the meeting you review the About Us tab on their website and their business plan, linked below:

Meer-Kitty Interior Design has two goals. They want to expand their online audience, which means getting their company and brand known by as many people as possible. They also want to launch a line of high-quality indoor paint to be sold in-store and online. You decide to consider the data about indoor paint first.

To use the template for the survey feedback, click the link below and select “Use Template.” Link to template: Kitty Survey Feedback OR If you don’t have a Google account, download the file directly from the attachment below.

When you refer to the Meer-Kitty survey feedback tab, you are pleased to find that the available data is aligned to the business objective. However, you do some research about confidence level for this type of survey and learn that you need at least 120 unique responses for the survey results to be useful. Therefore, the dataset has two limitations: First, there are only 40 responses; second, a Meer-Kitty superfan, User 588, completed the survey 11 times.

As the survey has too few responses and numerous duplicates that are skewing results, what are your options?
Select all that apply.
  • Repeat the survey in order to create a new, improved dataset.
  • Talk with stakeholders and ask for more time.
  • Remove the duplicates from the data and proceed with analysis.
  • Locate another dataset about indoor paint.

Scenario 1 continued

During the meeting, you also learn that Meer-Kitty videos are hosted on their website. For each product offered, there is an accompanying video for customers to learn more. So, more views for a video suggests greater consumer interest. Your goal is to identify which videos are most popular, so Meer-Kitty knows what topics to explore in the future. Unfortunately, Meer-Kitty has just three months of data available because they only recently launched the videos on their site.

Without enough data to identify long-term trends about the video subjects that people prefer, what should you do?
  • Tell the client you’re sorry, but there is no way to meet their objective.
  • Find an alternate data source that will still enable you to meet your objective.
  • Watch the videos and use your gut instinct to identify which are most successful.
  • Move ahead with the data you have to determine the top video subjects.

Scenario 1 continued

Now that you’ve identified some limitations with Meer-Kitty’s data, you want to communicate your concerns to stakeholders. In addition to insufficient video trend data, your main concern with the indoor paint survey is that the data isn’t representative of the population as a whole. Clearly, one particular respondent, the superfan, is overrepresented. This is an example of margin of error.
  • True
  • False

Scenario 1 continued

The stakeholders understand your concerns and agree to repeat the indoor paint survey. In a few weeks, you have a much better dataset with more than 150 responses and no duplicates.
To use the template for the survey feedback, click the link below and select “Use Template.” Link to template: Kitty Survey Feedback Or, if you don’t have a Google account, download the file directly from the attachment below. If you are using the template, please refer to the New Meer-Kitty survey feedback tab.

You notice that questions 4 and 5 are dependent on the respondent’s answer to question 3. So, you need to determine how many people answered Yes to question 3, then compare that to responses to questions 4 and 5. That way, you will know if questions 4 and 5 have any nulls. You decide to use a spreadsheet tool that changes how cells appear when they contain the word Yes. Which tool do you use?
  • Filtering
  • Data validation
  • CONCATENATE
  • Conditional formatting

Scenario 1, continued

You have finished cleaning the data to ensure it is complete, correct, and relevant to the problem you’re trying to solve. Then, you complete the verification and reporting processes to share the details of your data-cleaning effort with your team. Your team notes one aspect of data cleaning that would help improve the dataset. They point out that the new survey also has a new question in Column G: “What are your favorite indoor paint colors?” This was a free-response question, so respondents typed in their answers. Some people included multiple different colors of paint. In order to determine which colors are most popular, it will be necessary to put each color in its own cell. You use a spreadsheet function to divide the text strings in Column G around the commas and put each fragment into a new, separate cell. In this example, what are the commas called?
  • Substrings
  • MIDs
  • Delimiters
  • Partitions

Scenario 2, questions 6-10

You’ve completed this program and are interviewing for a junior data scientist position. The job is at B.Spoke Market Research, a company that analyzes market conditions using customer surveys and other research methods.
The detailed job description can be found below:

So far, you’ve had a phone interview with a recruiter and you’ve secured a second interview with the B.Spoke team. The recruiter’s email can be found below:

You arrive 15 minutes early for your interview. Soon, you are escorted into a conference room, where you meet Jodie Choi, the data science lead. After welcoming you, the behavioral interview begins.

For your first question, your interviewer wants to learn about your experience with spreadsheets. She says: Sometimes the team needs data that is stored in different spreadsheets. So, we use spreadsheet functions to help us find the information we need. What function would you use to search for a certain value in a spreadsheet column to return the corresponding piece of information?
  • RETURN
  • SEARCH
  • COUNTIF
  • VLOOKUP

Scenario 2, continued

Next, your interviewer wants to know more about your understanding of tools that work in both spreadsheets and SQL queries. She explains that the data her team receives from customer surveys sometimes has many duplicate entries. She says: Spreadsheets have a great tool for that called remove duplicates. But when writing a SQL query, what command should you include in your SELECT statement to remove duplicates?
  • DIVERSE
  • DIFFERENT
  • DISCRETE
  • DISTINCT

Scenario 2, continued

Now, your interviewer explains that the data team usually works with very large amounts of customer survey data. After receiving the data, they import it into a SQL table. But sometimes, the new dataset imports incorrectly and they need to change the format. She asks: What function would you use to convert data in a SQL table from one datatype to another?
  • CAST
  • CHANGE
  • CONVERSE
  • COALESCE

Scenario 2, continued

Next, your interviewer explains that one of their clients is an online retailer that has a vast inventory. She has a list of items by name, color, and size. Then, she has another list of the price of each item by size, as a larger item sometimes costs more.
The stakeholder needs one list of all items by name, color, size, and price. She then says: In situations such as this one, could you use the CONCAT function to add strings together to create new text strings?
  • Yes
  • No

Scenario 2, continued

For your final question, your interviewer explains that her team often comes across data with extra leading or trailing spaces. She asks: Which function would enable you to eliminate those extra spaces? You respond: To eliminate extra spaces for consistency, use the TRIM function.
  • True
  • False

Shuffle Q/A

Scenario 1, questions 1-5

You are a data analyst at a small analytics company. Your company is hosting a project kick-off meeting with a new client, Meer-Kitty Interior Design. The agenda includes reviewing their goals for the year, answering any questions, and discussing their available data. Before the meeting you review the About Us tab on their website and their business plan, linked below:

Meer-Kitty Interior Design has two goals. They want to expand their online presence, which means getting their company and brand known by as many people as possible. They also want to launch a line of high-quality indoor paint to be sold in-store and online. You decide to consider the data about indoor paint first.

To use the template for the survey feedback, click the link below and select “Use Template.” Link to template: Kitty Survey Feedback OR If you don’t have a Google account, download the file directly from the attachment below.

When you refer to the Meer-Kitty survey feedback tab, you are pleased to find that the available data is aligned to the business objective. However, you do some research about confidence level for this type of survey and learn that you need at least 120 unique responses for the survey results to be useful. Therefore, the dataset has two limitations: First, there are only 40 responses; second, a Meer-Kitty superfan, User 588, completed the survey 11 times.
As the survey has too few responses and numerous duplicates that are skewing results, you should remove the duplicates and continue analyzing the remaining 29 responses.
  • True
  • False

Scenario 1 continued

During the meeting, you also learn that Meer-Kitty videos are hosted on their website. For each product offered, there is an accompanying video for customers to learn more. So, more views for a video suggests greater consumer interest. Your goal is to identify which videos are most popular, so Meer-Kitty knows what topics to explore in the future. Unfortunately, Meer-Kitty has just three months of data available because they only recently launched the videos on their site.

Without enough data to identify long-term trends about the video subjects that people prefer, what are your available options? Select all that apply.
  • Move ahead with the data you have to determine the top video subjects.
  • Watch the videos and use your gut instinct to identify which are most successful.
  • Ask to wait for more data and provide Meer-Kitty with an updated timeline.
  • Talk with Meer-Kitty stakeholders and ask to adjust the objective.

Scenario 1 continued

The stakeholders understand your concerns and agree to repeat the indoor paint survey. In a few weeks, you have a much better dataset with more than 150 responses and no duplicates.

To use the template for the survey feedback, click the link below and select “Use Template.” Link to template: Kitty Survey Feedback OR If you don’t have a Google account, download the file directly from the attachment below. If you are using the template, please refer to the New Meer-Kitty survey feedback tab.

You notice that questions 4 and 5 are dependent on the respondent’s answer to question 3. So, you need to determine how many people answered Yes to question 3, then compare that to responses to questions 4 and 5. That way, you will know if questions 4 and 5 have any nulls.
You decide to use a spreadsheet tool that changes how cells appear when they contain the word Yes. When using this tool, what is the word Yes?
  • The value in a VLOOKUP statement
  • The value in a conditional formatting rule
  • The value in a CONCATENATE range
  • The value in the COUNTA range

Scenario 2, questions 6-10

You’ve completed this program and are interviewing for a junior data scientist position. The job is at B.Spoke Market Research, a company that analyzes market conditions using customer surveys and other research methods. The detailed job description can be found below:

So far, you’ve had a phone interview with a recruiter and you’ve secured a second interview with the B.Spoke team. The recruiter’s email can be found below:

You arrive 15 minutes early for your interview. Soon, you are escorted into a conference room, where you meet Jodie Choi, the data science lead. After welcoming you, the behavioral interview begins.

For your first question, your interviewer wants to learn about your experience with spreadsheets. She says: Sometimes the team needs data that is stored in different spreadsheets. So, we use a spreadsheet function to find the information we need. There is a spreadsheet function that searches for a value in the first column of a given range and returns the value of a specified cell in the row in which it is found. It is called SEARCH.
  • True
  • False

Scenario 2, continued

Next, your interviewer wants to know more about your understanding of tools that work in both spreadsheets and SQL. She explains that the data her team receives from customer surveys sometimes has many duplicate entries. She says: Spreadsheets have a great tool for that called remove duplicates. Does this mean the team has to remove the duplicate data in a spreadsheet before transferring data to our database?
  • Yes
  • No

Scenario 2, continued

Now, your interviewer explains that the data team usually works with very large amounts of customer survey data.
After receiving the data, they import it into a SQL table. But sometimes, the new dataset imports incorrectly and they need to change the format. She asks: Is there a SQL function that can convert data types such as currency, dates, and times in a SQL table?
  • Yes, data types including currency, dates, and times can be converted.
  • No, only currency can be converted.

Scenario 2, continued

Next, your interviewer explains that one of their clients is an online retailer that needs to create product numbers for a vast inventory. Her team does this by combining the text strings for product number, manufacturing date, and color. She asks: If you encountered a situation where you wanted to add strings together to create new text strings, which SQL function would you use?
  • COMBINE
  • COALESCE
  • CREATE
  • CONCAT

Scenario 2, continued

For your final question, your interviewer explains that her team often comes across data with extra leading or trailing spaces. She asks: Which SQL function enables you to eliminate those extra spaces for consistency?
  • TRIM
  • LEN
  • SUBSTR
  • LENGTH

Scenario 1 continued

Now that you’ve identified some limitations with Meer-Kitty’s data, you want to communicate your concerns to stakeholders. In addition to insufficient video trend data, your main concern with the indoor paint survey is that the data isn’t representative of the population as a whole. Clearly, one particular respondent, the superfan, is overrepresented. This means the data doesn’t represent the population as a whole. When surveying people for Meer-Kitty in the future, what are some best practices you can use to address some of the issues associated with sampling bias? Select all that apply.
  • Increase sample size
  • Use data that keeps updating
  • Use data from only one source
  • Use random sampling

Scenario 1, continued

You have finished cleaning the data to ensure it is complete, correct, and relevant to the problem you’re trying to solve.
Then, you complete the verification and reporting processes to share the details of your data-cleaning effort with your team. Your team notes one aspect of data cleaning that would help improve the dataset. They point out that the new survey also has a new question in Column G: “What are your favorite indoor paint colors?” This was a free-response question, so respondents typed in their answers. Some people included multiple different colors of paint. In order to determine which colors are most popular, it will be necessary to put each color in its own cell. You decide to use a spreadsheet function to divide the text strings in Column G around the commas and put each fragment into a new, separate cell. You are using the SPLIT function.
  • True
  • False

Scenario 2, continued

Next, your interviewer wants to know more about your understanding of tools that work in both spreadsheets and SQL. She explains that the data her team receives from customer surveys sometimes has many duplicate entries. She says: Spreadsheets have a great tool for that called remove duplicates. In SQL, you can include DISTINCT to do the same thing. In which part of the SQL statement do you include DISTINCT?
  • The UPDATE statement
  • The SELECT statement
  • The FROM statement
  • The WHERE statement

Scenario 2, continued

Now, your interviewer explains that the data team usually works with very large amounts of customer survey data. After receiving the data, they import it into a SQL table. But sometimes, the new dataset imports incorrectly and they need to change the format. She asks: Is there a command or function that converts data in a SQL table from one datatype to another? You respond: Yes, it’s the CAST function.
  • True
  • False

Scenario 2, continued

Next, your interviewer explains that one of their clients is an online retailer that has a vast inventory. She has a list of items by name, color, and size. Then, she has another list of the price of each item by size, as a larger item sometimes costs more.
The client needs one list of all items by name, color, size, and price. She then asks: If you were to use the CONCAT function to complete this task, what would it enable you to do?
  • Search for and return missing products in inventory
  • Create a unique key to tell products apart
  • Clean the product identifier text strings
  • Create a new product database table

Scenario 2, continued

For your final question, your interviewer explains that her team often uses the TRIM function when writing SQL queries. She asks: What is the TRIM function used for in SQL?
  • To eliminate extra leading or trailing spaces
  • To return the smallest numeric value from a list
  • To shorten the list of results
  • To eliminate null values

Scenario 1, questions 1-5

You are a data analyst at a small analytics company. Your company is hosting a project kick-off meeting with a new client, Meer-Kitty Interior Design. The agenda includes reviewing their goals for the year, answering any questions, and discussing their available data. Before the meeting you review the About Us tab on their website and their business plan, linked below:

Meer-Kitty Interior Design has two goals. They want to expand their online audience, which means getting their company and brand known by as many people as possible. They also want to launch a line of high-quality indoor paint to be sold in-store and online. You decide to consider the data about indoor paint first.

To use the template for the survey feedback, click the link below and select “Use Template.” Link to template: Kitty Survey Feedback OR If you don’t have a Google account, download the file directly from the attachment below.

When you refer to the Meer-Kitty survey feedback tab, you are pleased to find that the available data is aligned to the business objective. However, you do some research about confidence level for this type of survey and learn that you need at least 120 unique responses for the survey results to be useful.
Therefore, the dataset has two limitations: First, there are only 40 responses; second, a Meer-Kitty superfan, User 588, completed the survey 11 times.

As the survey has too few responses and numerous duplicates that are skewing results, you decide to repeat the survey in order to create a new, improved dataset. What is your first step?
  • Delete all of the data from the current, skewed survey.
  • Write new, improved survey questions.
  • Find a survey tool that only allows someone to complete the survey once.
  • Talk with stakeholders, explain the new timeline, and ask for approval.

Scenario 1 continued

Now that you’ve identified some limitations with Meer-Kitty’s data, you want to communicate your concerns to stakeholders. In addition to insufficient video trend data, your main concern with the indoor paint survey is that the data isn’t representative of the population as a whole. Clearly, one particular respondent, the superfan, is overrepresented. What does this situation describe?
  • Sampling bias
  • Statistical significance
  • Margin of error
  • Confidence level

Scenario 1 continued

The stakeholders understand your concerns and agree to repeat the indoor paint survey. In a few weeks, you have a much better dataset with more than 150 responses and no duplicates.

To use the template for the survey feedback, click the link below and select “Use Template.” Link to template: Kitty Survey Feedback OR If you don’t have a Google account, download the file directly from the attachment below. If you are using the template, please refer to the New Meer-Kitty survey feedback tab.

You notice that questions 4 and 5 are dependent on the respondent’s answer to question 3. So, you need to determine how many people answered Yes to question 3, then compare that to responses to questions 4 and 5. That way, you will know if questions 4 and 5 have any nulls. You decide to use a spreadsheet tool that changes how cells appear when they meet a certain value, in this case the word Yes.
You are using VLOOKUP.
  • True
  • False

Course 5 – Analyze Data to Answer Questions

Week 1 – Organizing data to begin analysis

Which of the following tasks would a data analyst perform during the analyze phase of the data analysis process? Select all that apply.
  • Preparing a report for the stakeholders
  • Getting input from others
  • Visualizing the data with charts
  • Organizing data into understandable sections

A data analyst working on a dataset performs several calculations with the data. What phase of analysis is the analyst in?
  • Transform data
  • Organize data
  • Get input from others
  • Format and adjust data

A data analyst is sorting spreadsheet data. What tool should they use to make sure that the data across rows is kept together when they rearrange the data?
  • Sort together
  • Sort sheet
  • Sort column
  • Sort rows

A data analyst sorts a spreadsheet range between cells A15 and G71. They sort in ascending order by the second column, Column B. What is the syntax they are using?
  • =SORT(A15:G71, 2, FALSE)
  • =SORT(A15:G71, 2, TRUE)
  • =SORT(A15:G71, B, TRUE)
  • =SORT(A15:G71, B, FALSE)

What is the goal of the analysis phase of the data analysis process?
  • To describe data structures
  • To generate new data
  • To identify trends and relationships in data
  • To make generalizations about data

During which of the four phases of analysis do you compare your data to external sources?
  • Format and adjust data
  • Transform data
  • Get input from others
  • Organize data

Which of the following actions might occur when transforming data? Select all that apply.
  • Identify a pattern in your data
  • Make calculations based on your data
  • Recognize relationships in your data
  • Eliminate irrelevant info from your data

Typically, a data analyst uses filters when they want to expand the amount of data they are working with.
  • True
  • False

A data analyst is sorting data in a spreadsheet. They select a specific collection of cells in order to limit the sorting to just specified cells.
Which spreadsheet tool are they using?
  • Sort Sheet
  • Sort Range
  • Limit Sort
  • Limit Range

A data analyst sorts a spreadsheet range between cells D5 and M5. They sort in descending order by the third column, Column F. What is the syntax they are using?
  • =SORT(D5:M5, C, TRUE)
  • =SORT(D5:M5, 3, FALSE)
  • =SORT(D5:M5, C, FALSE)
  • =SORT(D5:M5, 3, TRUE)

You are querying a database that contains data about music. Each musical genre is given an ID number. You are only interested in data related to the genre with ID number 7. The genre IDs are listed in the genre_id column. You write the SQL query below. Add a WHERE clause that will return only data about the genre with ID number 7.

Who is the composer listed in row 4 of your query result?
  • Caetano Veloso
  • Marisa Monte
  • Lulu Santos
  • Gilberto Gil

You are working with a database that contains invoice data about online music purchases. You are only interested in invoices sent to customers located in the city of Delhi. You want to sort the invoices by order total in ascending order. The order totals are listed in the total column. You write the SQL query below. Add an ORDER BY clause that will sort the invoices by order total in ascending order.

What total appears in row 4 of your query result?
  • 1.98
  • 5.94
  • 8.91
  • 3.96

Shuffle Q/A

Fill in the blank: The _____ phase of the data analysis process includes organizing data, formatting and adjusting data, getting input from others, and transforming data by observing relationships between data points and making calculations.
  • process
  • prepare
  • analyze
  • act

During which of the four phases of analysis do you gather the relevant datasets into a usable structure for a project?
  • Format and adjust data
  • Get input from others
  • Transform data
  • Organize data

Fill in the blank: Sorting ranks data based on a specific _____ that you select.
  • calculation
  • observation
  • metric
  • model

A data analyst is sorting data in a spreadsheet.
Which tool are they using if all of the data is sorted by the ranking of a specific sorted column and data across rows is kept together?
  • Sort Sheet
  • Sort Together
  • Sort Rank
  • Sort Document

A data analyst sorts a spreadsheet range between cells A1 and E50. They sort in descending order by the fourth column, Column D. What is the syntax they are using?
  • =SORT(A1:E50, 4, FALSE)
  • =SORT(A1:E50, 4, TRUE)
  • =SORT(A1:E50, D, TRUE)
  • =SORT(A1:E50, D, FALSE)

You are querying a database that contains data about music. You are only interested in data related to the jazz musician Miles Davis. The names of the musicians are listed in the composer column. You write the following SQL query, but it is incorrect. What is wrong with the query?

    SELECT *
    FROM Track
    WHERE composer = Miles Davis

  • Line 3 should be rewritten as WHERE composer is Miles Davis.
  • Composer in line 3 should be capitalized.
  • SELECT, FROM, and WHERE should not be capitalized.
  • Miles Davis should be in double quotation marks.

You are working with a database that contains invoice data about online music purchases. You are only interested in invoices sent to customers located in the city of Paris. You want to sort the invoices by order total in ascending order. The order totals are listed in the total column. You write the SQL query below. However, this query is incorrect. What is wrong with it?

    SELECT *
    FROM invoice
    WHERE billing_city = “Paris”
    ORDER total

  • SELECT, FROM, WHERE, and ORDER are capitalized.
  • Line 4 is missing the text column = between ORDER and total.
  • In line 3, “Paris” has quotation marks.
  • Line 4 is missing the word BY between ORDER and total.

After collecting the relevant datasets for their analysis, a data analyst compares this data to external sources. In which of the four phases of analysis does this occur?
  • Organize data
  • Format and adjust data
  • Transform data
  • Get input from others

A data analyst working on a data set is investigating possible relationships in the data.
What phase of analysis is the analyst in?Format and adjust dataGet input from othersTransform dataOrganize data A data analyst sorts a spreadsheet range between cells K9 and L20. They sort in ascending order by the first column, Column K. What is the syntax they are using?=SORT(K9:L20, K, TRUE)=SORT(K9:L20, K, FALSE)=SORT(K9:L20, 1, TRUE)=SORT(K9:L20, 1, FALSE) You are querying a database that contains data about music. Each album is given an ID number. You are only interested in data related to the album with ID number 3. The album IDs are listed in the album_id column.You write the following SQL query, but it is incorrect. What is wrong with the query?SELECT *FROM TrackWHERE album = 3In line 3, album should be album_id.SELECT, FROM, and WHERE should be capitalized.In line 3, album is not capitalized.Line 3 contains an equal sign. In the data analysis process, which of the following refers to a phase of analysis? Select all that apply.Format data using sorts and filtersGet input from othersOrganize data into understandable sectionsVisualize the data A data analyst is collecting all the datasets that are relevant to their project. Which of the four phases of analysis is the data analyst in?Get input from othersOrganize dataFormat and adjust dataTransform data A data analyst investigating a data set is interested in showing only data that matches given criteria. What is this known as? SortingModelingMeasuringFiltering You are working with a database that contains invoice data about online music purchases. You are only interested in invoices sent to customers located in the city of Delhi. You want to sort the invoices by order total in ascending order. The order totals are listed in the total column.You write the SQL query below. However this query is incorrect. 
What is wrong with it?SELECT *FROM invoiceWHERE billing_city = “Delhi”ORDER BY order_totalSELECT, FROM, WHERE, and ORDER BY are capitalized.In line 4, order_total should be total.In line 3, “Delhi” has quotation marks.Line 4 contains the word BY. A data analyst chooses to rank the data based on a specific metric. What is the term for this action?SortingFilteringModelingMeasuring A data analyst investigates the data they’ve collected to look for patterns and relationships between the data. They also perform calculations based on the data. In which of the four phases of analysis does this occur?Format and adjust dataTransform dataGet input from othersOrganize data A data analyst working on a very large dataset decides to narrow the scope of the data that they are working with in order to make the analysis more manageable. What can they use to narrow the amount of data?ModelingSortingFilteringMeasuring A data analyst uses a function to sort a spreadsheet range between cells H1 and K65. They sort in ascending order by the first column, Column H. What is the syntax they are using?=SORT(H1:K65, 1, FALSE)=SORT(H1:K65, A, TRUE)=SORT(H1:K65, A, FALSE)=SORT(H1:K65, 1, TRUE) You are querying a database that contains data about music. Each musical genre is given an ID number. You are only interested in data related to the genre with ID number 2. The genre IDs are listed in the genre_id column. You write the following SQL query, but it is incorrect. What is wrong with the query? SELECT *FROM TrackWHERE composer = 2Line 3 contains an equal sign.Composer should be genre_id in line 3.Composer is not capitalized in line 3.SELECT, FROM, and WHERE are capitalized. You are performing a calculation during your analysis of a dataset. Which phase of analysis are you in?Get input from othersFormat and adjust dataOrganize dataTransform data A data analyst is sorting spreadsheet data. They use the spreadsheet tool Sort Sheet. 
What does this tool do?It sorts all of the data in a spreadsheet by a specific sorted column.It sorts all of the data in a spreadsheet by the ranking of a specific sorted row.It allows the analyst to sort by a specific sorted row.It allows the analyst to sort a specific selection of cells only. Week 2 – Formatting and adjusting data You are responsible for maintaining the integrity of a dataset. Multiple analysts are working with this spreadsheet. What spreadsheet tool can you use to ensure that accidental changes are not recorded in the data?Data validationFindPop-up menusConditional formatting You are working with a SQL database with tables for flight routes in Canada. The table contains one column with the names of the departure airports. A different column in the same table contains the names of the arrival airports. What function can you use in your query to combine the arrival and departure airport names into a new column?COMBINEGROUPCONCATJOIN You are querying a database of ice cream flavors to determine which stores are selling the most mint chip. For your project, you only need the first 80 records. What clause should you add to the following SQL query?SELECT flavors FROM ice_cream_table WHERE flavor = “mint_chip”LIMIT 80LIMIT ,80LIMIT = 80LIMIT_80 An analyst notes that the “160” in cell A9 is formatted as text, but it should be Australian dollars. What spreadsheet tool can help them select the right format?Format as DollarEXCHANGECURRENCYFormat as Currency You are creating a spreadsheet to help you with your job search. Every time you find an interesting job, you add it to the spreadsheet. Then, you want to indicate two possible options: Need to Apply or Applied. What spreadsheet tool will save you time by enabling you to create a dropdown list with Need to Apply and Applied as the possible options?Pop-up menusData validationFindConditional formatting You are using a spreadsheet to keep track of your newspaper subscriptions. 
You add color to indicate if a subscription is current or has expired. Which spreadsheet tool changes how cells appear when values meet each expiration date?Data validationCONVERTConditional formattingAdd color You are analyzing data about the capitals of different countries. In your SQL database, you have one column with the names of the countries and another column with the names of the capitals. What function can you use in your query to combine the countries and capitals into a new column?GROUPCONCATJOINCOMBINE You are querying a database of museums to determine which ones will have a sculpture exhibit this year. For your project, you only need the first 50 records. What clause should you add to the following SQL query? SELECT museumsFROM museum_tableWHERE exhibit = “sculpture”LIMIT 50LIMIT,50LIMIT = 50LIMIT_50 A data analyst is working with a spreadsheet that has very long text strings. Rather than counting the characters themselves to determine the number of characters they contain, what tool can they use?The MID functionThe CHAR functionThe LEN functionThe COUNT function Spreadsheet cell L6 contains the text string “Function”. To return the substring “Fun”, what is the correct syntax?=RIGHT(L6, 3)=LEFT(3,L6)=LEFT(L6, 3)=RIGHT(3,L6) When working with spreadsheets, data analysts can use the WHERE function to locate specific characters in a string.TrueFalse Shuffle Q/A An analyst has financial data that is formatted as Canadian dollars, but it should be formatted as U.S. dollars. What spreadsheet tool can help them select the right format?Format as DollarsFormat as NumberFormat as CurrencyFormat as Money You are preparing a project tracker spreadsheet. Next to each project task, you need to add the name of the team member responsible. 
What spreadsheet tool will save you time by enabling you to create a drop-down list with team members’ names as the possible options?FindConditional formattingPop-up menusData validation You are working with a SQL database that contains tables for the locations for a popular fast food restaurant. In this database, you have one column with the city location and another column with the state location for each restaurant. What function can you use in your query to combine the city and state into a new column?COMBINECONCATJOINGROUP Fill in the blank: A data analyst is working with a spreadsheet that has very long text strings. They use the LEN function to count the number of _____ in the text strings.substringscharactersvaluesfields Spreadsheet cell H8 contains the text string “Marketing”. To return the substring “market”, what is the correct syntax?=RIGHT(6,H8)=LEFT(H8, 6)=RIGHT(H8, 6)=LEFT(6,H8) You are querying a database of restaurant locations to determine how many fast food companies have restaurants located in Texas. For your project, you only need the first 20 records. What clause should you add to the following SQL query? SELECT fast_foodFROM restaurant_tableWHERE location = “Texas”LIMIT,20LIMIT_20LIMIT 20LIMIT = 20 A data analyst is working with a spreadsheet that has very long text strings. They use a function to count the number of characters in cell B9. What is the correct syntax of the function?=LEN(B9)=LEN(“B9”)=LEN(B,9)=LEN(B:B9) You are working with a data set that contains string data. Cell C4 contains the string “Oct 13, 2004”. What does the function FIND(“,”, C4) output?4687 An analyst notes that the “235” in cell B8 is formatted as text, but it should be Euros. What spreadsheet tool can help them select the right format?Format as EurosFormat as MoneyFormat as NumberFormat as Currency A utility company uses a spreadsheet to track the number of consecutive months each customer has paid their bill on time. 
They use a spreadsheet tool to apply color to the cells when the number of consecutive months is 12 or greater. What tool are they using?Data validationAdd colorCONVERTConditional formatting Spreadsheet cell F2 contains the text string “Dashboard”. To return the substring “board”, what is the correct syntax?=LEFT(5,F2)=LEFT(F2, 5)=RIGHT(5,F2)=RIGHT(F2, 5) You are using the FIND function to identify the position of the whitespace in the string in cell A6. Which of the following is the correct function syntax for this purpose?FIND(“_”, A6)FIND(A6, _ )FIND(A6, “ “)FIND(“ “, A6) You are analyzing employee data for your company. In your SQL database, you have one column with the first names of the employees and another column with their last names. What function can you use in your query to combine the employee first names and last names into a new column?CONCATCOMBINEJOINGROUP An analyst is working with a dataset of financial data. The data is formatted as U.S. dollars, and the analyst needs it to be in Japanese yen. What spreadsheet tool can help them select the right format?Format as CurrencyFormat as MoneyFormat as NumberFormat as Yen Which of the following are appropriate uses for a spreadsheet’s data validation tool? Select all that apply.Avoiding invalid inputs to functionsAdding drop down menus on cellsMerging two or more columns.Protecting structured data You are working with a spreadsheet that records the running time of various songs. What spreadsheet tool can you use to change how the cells appear when their value is less than 20 seconds?CONVERTData validationConditional formattingAdd color A data analyst wants to write a SQL query to combine data from two columns and into a new column. 
What function can they use?GROUPCONCATJOINCOMBINE Fill in the blank: When working with spreadsheets, data analysts can use the _____ function to locate specific characters in a string.IDENTIFYFROMWHEREFIND A data analyst at a symphony orchestra uses a spreadsheet to keep track of how many concerts require more than 80 musicians. They use a spreadsheet tool to change how cells appear when values equal 80 or more. What tool are they using?CONVERTAdd colorConditional formattingData validation A data analyst is working with a spreadsheet that has very long text strings. They use a function to count the number of characters in cell G11. What is the correct syntax of the function?=LEN(“G11”)=LEN(G11)=LEN(G,11)=LEN(G:G11) Spreadsheet cell C2 contains the text string “Deviation”. To return the substring “Dev”, what is the correct syntax?=LEFT(3,C2)=RIGHT(3,C2)=LEFT(C2, 3)=RIGHT(C2, 3) When working with spreadsheets, data analysts use the find function to locate specific characters in a string. Find is case-sensitive, so it’s necessary to input the substring exactly how it appears.TrueFalse Week 3 – Aggregating data for analysis When using VLOOKUP, there are some common limitations that data analysts should be aware of. One of these limitations is that VLOOKUP only returns the first match it finds, even if there are many possible matches within the column.TrueFalse Fill in the blank: Data aggregation involves creating a _____ collection of data that originally came from multiple sources.expandedlocalizedmodifiedsummarized A data analyst uses the SUM function to add together numbers from a spreadsheet. However, after getting a zero result, they realize the numbers are actually text. What function can they use to convert the text to a numeric value?VALUEFIGURECONVERTDIGIT When using VLOOKUP, there are some common limitations that data analysts should be aware of. 
One of these limitations is that VLOOKUP can only return a value from the data to the left of the matched value.
• True
• False

Fill in the blank: When writing a function, a data analyst wraps a table array in dollar signs. This is an _____, which is used to lock the array so rows and columns don’t change if the function is copied.
• accurate reference
• absolute reference
• arbitrary reference
• authentic reference

The following is a selection from a spreadsheet: To search for the growth in population in Indonesia, what is the correct VLOOKUP syntax?
• =VLOOKUP("Indonesia", A2:C10, 3, false)
• =VLOOKUP(Indonesia, A2*C10, 3, false)
• =VLOOKUP("Indonesia", A2:C10, 2, false)
• =VLOOKUP(Indonesia, A2:C10, 2, false)

An INNER JOIN is a function that returns records with matching values in two or more tables. An OUTER JOIN is a function that combines RIGHT and LEFT JOIN to return all matching records in both tables.
• True
• False

A data analyst writes a query that asks a database to return the number of rows in a specified range. Which function do they use?
• COUNT DISTINCT
• RANGE
• COUNT
• RETURN RANGE

Fill in the blank: In an SQL statement, the _____ is the name of the segment that executes first. Select all that apply.
• central select
• inner query
• central query
• inner select

Shuffle Q/A

In data analytics, what is the process of gathering data from multiple sources and combining it into a single, summarized collection?
• Data composition
• Data aggregation
• Data grouping
• Data mapping

A data analyst is performing numerical calculations on the data in their spreadsheet. Ahead of these calculations, they use the VALUE function. Why might they do this?
• To get a list of all the distinct numbers in the data
• To convert the numbers in the data from text to numerical values
• To sum up all the numbers in the spreadsheet
• To find the average of all the numbers in the spreadsheet

You create a function using data values from a specified array. You notice that it works correctly only some of the time.
You verify that the function was used correctly and you ask a colleague for their input. They ask if you locked the data array. What does this mean? Select all that apply.The data array has been made an absolute reference.The columns in the array cannot be changed.The data is accessible with a password.The rows in the array cannot be changed. The following is a selection from a spreadsheet:To search for the population of Bangladesh, what is the correct VLOOKUP syntax?=VLOOKUP(“Bangladesh”, A2:B10, 3, false)=VLOOKUP(Bangladesh, A2:B10, 3, false)=VLOOKUP(“Bangladesh”, A2:B10, 2, false)=VLOOKUP(Bangladesh, A2*B10, 2, false) A data analyst writes a query in SQL with the RIGHT JOIN functionFROM fiction_tableRIGHT JOINbooks_tableWhat does this function do?It returns all the records in the fiction table and only the records from the books table with matching values.It returns all records in both the fiction table and the books table.It returns only the records with values that match from both tables.It returns all records in the books table and only the records from the fiction table with matching values. The COUNT DISTINCT function includes repeating values when returning values in a specified range.TrueFalse A data analyst writes a query in SQL. Inside this query, they have a second query. What is this second query called? Select all that apply.SubqueryCentral querySmaller queryNested query One of the limitations of the VLOOKUP function is that it can only search columns to the right of the column into which it is entered. What is another limitation of VLOOKUP?It will only return the last match it finds.It can only be used on numerical data.It can only be used with text data.It will only return the first match it finds. A data analyst wraps the data array for their function in dollar signs ($). What does this do? 
Select all that apply.It converts the data to currency.It makes it so that columns cannot be changed.It makes it so that rows cannot be changed.It creates an absolute reference. The following is a selection from a spreadsheet:To search for the population of Brazil, what is the correct VLOOKUP syntax?=VLOOKUP(“Brazil”, A2:B10, 2, false)=VLOOKUP(Brazil, A2:B10, 2, false)=VLOOKUP(Brazil, A2,B10, 3, false)=VLOOKUP(Brazil, A2:B10, 3, false) You are writing a query that contains the COUNT function. What should this query return?The number of rows in a specified rangeThe number of times the query has been runThe sum of all values in a specified rangeThe number of columns in a specified range A data analyst wants to be sure all of the numbers in a spreadsheet are numeric. What function should they use to convert text to numeric values?VALUEPROCESSCONVERTEXCHANGE The following is a selection from a spreadsheet:To search for the population of Pakistan, what is the correct VLOOKUP syntax?=VLOOKUP(Pakistan, A2*B10, 2, false)=VLOOKUP(Pakistan, A2:B10, 3, false)=VLOOKUP(“Pakistan”, A2:B10, 2, false)=VLOOKUP(“Pakistan”, A2:B10, 3, false) A data analyst writes the following query in SQL with the LEFT JOIN function: FROM music_tableLEFT JOINEntertainment_table What does this function do?It returns all records in the music table and only the records from the entertainment table with matching values.It returns only the records with values that match from both tables.It returns all the records in the entertainment table and only the record from the music table with matching values.It returns all records in both the music table and the entertainment table. 
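The LEFT JOIN behaviour described in the question above can be checked on a toy database. The following is a minimal sketch using Python's built-in sqlite3 module; the column names (artist, genre, venue) and the sample rows are invented for illustration and do not come from the course material.

```python
import sqlite3


def join_demo():
    """Build two tiny in-memory tables and compare LEFT JOIN with INNER JOIN."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical sample data: only 'Ana' appears in both tables.
    conn.executescript("""
        CREATE TABLE music_table (artist TEXT, genre TEXT);
        CREATE TABLE entertainment_table (artist TEXT, venue TEXT);
        INSERT INTO music_table VALUES ('Ana', 'jazz'), ('Bo', 'rock'), ('Cy', 'folk');
        INSERT INTO entertainment_table VALUES ('Ana', 'Hall A'), ('Dee', 'Hall D');
    """)
    # LEFT JOIN: every row of music_table, plus matching rows from entertainment_table.
    left_rows = conn.execute("""
        SELECT m.artist, e.venue
        FROM music_table AS m
        LEFT JOIN entertainment_table AS e ON m.artist = e.artist
        ORDER BY m.artist
    """).fetchall()
    # INNER JOIN: only rows whose artist appears in both tables.
    inner_rows = conn.execute("""
        SELECT m.artist, e.venue
        FROM music_table AS m
        INNER JOIN entertainment_table AS e ON m.artist = e.artist
        ORDER BY m.artist
    """).fetchall()
    conn.close()
    return left_rows, inner_rows


left_rows, inner_rows = join_demo()
print(left_rows)   # unmatched music_table rows carry None (SQL NULL) for venue
print(inner_rows)
```

In this sketch the LEFT JOIN keeps all three music_table rows and fills the missing venue with NULL, while the INNER JOIN returns only the single matching row.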
When working with subqueries, which part of the query segment executes first?The inner queryThe smaller queryThe outer queryThe larger query In data analytics, what is data aggregation?The process of moving certain data points to a higher rank or position.The process of modifying data in order to make it suitable for analysis.The process of ensuring a company’s data is properly stored, managed, and maintained.The process of gathering data from multiple sources and combining it into a single, summarized collection. VLOOKUP can have problems when used on data values that have leading and trailing spaces. What function can be used to eliminate these spaces?TRIMNOSPACEVALUECUT You are using the VLOOKUP function in a specific column in your spreadsheet. You know that one of VLOOKUP’s limitations is that it can only search in columns to the right of the column into which it is entered. What can you do if you also want the function to search the data found to the left?Use VLOOKUP in the leftmost columnUse VLOOKUP in the rightmost columnCopy that data into new columns to the rightMake the data into an absolute reference A data analyst creates an absolute reference around a function array. What is the purpose of the absolute reference?To automatically change numeric values to currency valuesTo keep a function array consistent so rows and columns will automatically change if the function is copiedTo lock the function array so rows and columns don’t change if the function is copiedTo copy a function and apply it to all rows and columns When creating an SQL query, which JOIN clause returns all matching records in two or more database tables?OUTERINNERLEFTRIGHT A data analyst is working with data that has been collected over time and stored in different databases. What process must they perform if they are to calculate the statistics of this data?Data aggregationData mappingData groupingData composition A data analyst uses the TRIM function on their spreadsheet. 
Why might they do this?
• They plan to convert all numbers from text into numeric.
• VLOOKUP needs data values to have leading spaces.
• VLOOKUP needs data values to have trailing spaces.
• They plan to use VLOOKUP on the spreadsheet data.

A data analyst uses an absolute reference to lock a function array so rows and columns don’t change if the function is copied. What symbol is used to create an absolute reference?
• Ampersand (&)
• Asterisk (*)
• Dollar sign ($)
• Hashtag (#)

What are some of the advantages of using subqueries in SQL? Select all that apply.
• Subqueries can use special functions.
• The logic is easier to read and understand.
• All of the logic is in one place.
• The query processes more efficiently.

The VALUE function converts a numeric value into a text string in a spreadsheet.
• True
• False

A data analyst locks the rows and columns in their spreadsheet by wrapping their function’s data array in dollar signs ($). Why would they do this?
• To avoid incorrect calculations caused by changing the array
• So that the data auto deletes after the function is used
• So that other analysts’ functions cannot access the array
• To stop people from accessing sensitive information in the array

Which of the following terms describe a subquery? Select all that apply.
• Nested query
• Inner select
• Inner query
• Small query

Week 4 – Performing data calculations

A data analyst uses the following formula to calculate a new column in a SQL query. What best describes the result of the formula?
(colA + colB) / colC = new_col
• colB is subtracted from colA then the result is multiplied by colC.
• colB is added to colA then the result is multiplied by colC.
• colB is divided by colC then the result is added to colA.
• colB is added to colA then the result is divided by colC.

A data analyst is working with a spreadsheet from a furniture company. To use the template for this spreadsheet, click the link below and select “Use Template.” Link to template: Sample Transaction Table.
Or, if you don’t have a Google account, download the file directly from the attachment below. The syntax of which of the following formulas would allow the analyst to count purchase sizes of two or more?=COUNTIF(G2:G30, “>=2”)=COUNTIF(H2:H30, “>=2”)=SUMIF(H2:H30, “=4”)=SUMIF(G2:G30, “<=1”) You are working in a spreadsheet and use the SUMIF function in the formula below as part of your analysis. =SUMIF(A1:A25, ”<10”, C1:C25) Which part of this formula is the criteria or condition?”<10”=SUMIFC1:C25A1:A25 A data analyst is working in a spreadsheet and uses the SUMPRODUCT function in the formula below as part of their analysis. =SUMPRODUCT(A2:A10,B2:B10) How does the SUMPRODUCT function calculate the cell ranges identified in the parentheses?The analyst wants to figure out the value of all of the items in the spreadsheet. Which formula will calculate the total price of all of the items?It multiplies the values in the first range, then multiplies the values in the second range .It adds the values in the first range, then adds the values in the second range.It multiplies the ranges, then adds the sum of the products of the two ranges.It adds the ranges, then multiplies them by the last value in the second array. You create a pivot table in a spreadsheet containing movie data. To use the template for this spreadsheet, click the link below and select “Use Template.” Link to template: Movie Data Project. Or, if you don’t have a Google account, download the file directly from the attachment below. If you want to summarize the data using the AVERAGE function in the Values menu, which spreadsheet columns could you add data from? Select all that apply.Box Office RevenueMovie TitleGenreBudget A data analyst uses the following SQL query to perform basic calculations on their data. Which types of operators is the analyst using in this SQL query? Select all that apply.MultiplicationAdditionSubtractionDivision You are working with a database table that contains data about music. 
The table includes columns for track_id, track_name, composer, and milliseconds (duration of the music track). You are only interested in data about the classical musician Johann Sebastian Bach. You want to know the duration of each Bach track in seconds. You decide to divide milliseconds by 1000 to get the duration in seconds, and use the AS command to store the result in a new column called secs. Add a statement to your SQL query that calculates the duration in seconds for each track and stores it in a new column as secs. NOTE: The three dots (...) indicate where to add the statement.What is the duration in seconds of the track with Id number 3408?307120153193 You are working with a database table that contains data about music. The table includes columns for album_id and milliseconds (duration of the music tracks on each album). You want to find out the total duration for each album in milliseconds, and store the result in a new column named total_duration. You write the SQL query below. Add a GROUP BY clause that will group the data by album Id number.What is the total duration of the album with Id number 2?257252959711342562858088 You are working with a database table that contains invoice data. The table includes columns for billing_state, billing_country, and total. You want to know the average total price for the invoices billed to the state of Wisconsin. You decide to use the AVG function to find the average total, and use the AS command to store the result in a new column called average_total. Add a statement to your SQL query that calculates the average total and stores it in a new column as average_total. NOTE: The three dots (...) indicate where to add the statement.What is the average total for Wisconsin?5.545.786.085.37 Shuffle Q/AA data analyst wants to calculate the number of rows that have a value of “shipped”. 
Which function could they use?=MAX(G2:G30,”=shipped”)=SUM(G2:G30,”=shipped”)=COUNT(G2:G30,”=shipped”)=COUNTIF(G2:G30,”=shipped”) You are working in a spreadsheet and use the SUMIF function in the following formula as part of your analysis. =SUMIF(D2:D10,”>=50”,E2:E10) Which part of this formula indicates the range of values to be added? E2:E10>=50D2:D10=SUMIF You create a pivot table and want to add up the total of all cells for each row and column value in the pivot table. Which function in the values menu would you use to summarize the data?AVERAGESUMPRODUCTCOUNTA What column is set as a value in the following pivot table?DirectionDurationMAXDate In the following SQL query, which column is part of an addition operation that creates a new column?SELECTYes_Responses,No_Responses,Total_Surveys,Yes_Responses + No_Responses AS Responses_Per_SurveyFROMSurvey_1Total_SurveysResponses_Per_SurveyYes_ResponsesSurvey_1 What SQL operator is used to return the remainder of a division operation?/!=<>% What is the purpose of using data validation during your analysis process?To ensure that you are able to use every piece of data from your raw dataTo guarantee that all of your stakeholders will be happy with your resultsTo ensure that all data is complete, accurate, secure, and consistentTo guarantee that visualizations are visually pleasing What is the purpose of the <> operator in SQL?To add two valuesTo return the remainder of a division operationTo check if two values are not equalTo set a value equal to another What is a reason to use a temporary table instead of a standard table in SQL?A temporary table allows functions that are unavailable to standard tables.A temporary table calculates formulas using less memory than standard tables.A temporary table calculates formulas faster than standard tables.A temporary table allows analysts to repeatedly work with the same subset of data. 
Which of the following SQL queries adds a table into the database?SELECT * FROM table GROUP BY columnA ORDER BY columnB;CREATE TABLE my_table AS (SELECT * FROM other_table);SELECT * FROM table;WITH my_table AS (SELECT * FROM other_table WHERE x = 0); What is the purpose of using pivot tables?To multiply two arrays and add the resultsTo allow quick copying from one table to anotherTo view data in multiple ways to find insights and trendsTo allow the use of SQL in spreadsheets How many different columns have been added to the values section of the pivot table editor?3261 What SQL keyword is used to define a name for a calculated column?SELECTASFROMWITH A data analyst uses the following formula to calculate a new row in a SQL query. What best describes the result of the formula?(colA + colB) / colC = new_colcolB is added to colA then the result is multiplied by colC.colB is subtracted from colA then the result is multiplied by colC.colB is added to colA then the result is divided by colC.colB is divided by colC then the result is added to colA. What is the process of checking and rechecking the quality of your data so that it is complete, accurate, secure, and consistent?Data-driven developmentData visualizationData augmentationData validation A data analyst finds some data that seems inconsistent. What is the first thing they should do?Remove the inconsistent values.Convert the inconsistent values to JSON.Fill the odd values with filler values.Determine if the inconsistent values are valid. What is a reason to use a WITH AS clause in a SQL statement?The result is temporary.The result is a pivot table.The result calculates faster.The result is a visualization. 
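To make the "result is temporary" point about WITH ... AS concrete, here is a small sketch that runs the quiz's `WITH my_table AS (...)` pattern in SQLite through Python's sqlite3 module. The contents of other_table are made up for illustration. In SQLite the named result exists only for the statement that defines it, and other database systems may handle temporary results differently.

```python
import sqlite3


def cte_demo():
    """Run a WITH ... AS query and show that it adds no table to the database."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical source table with an x column, mirroring the quiz's WHERE x = 0.
    conn.executescript("""
        CREATE TABLE other_table (id INTEGER, x INTEGER);
        INSERT INTO other_table VALUES (1, 0), (2, 5), (3, 0);
    """)
    # my_table is defined and used within this single statement.
    rows = conn.execute("""
        WITH my_table AS (SELECT * FROM other_table WHERE x = 0)
        SELECT id FROM my_table ORDER BY id
    """).fetchall()
    # List the tables that actually exist in the database afterwards.
    tables = [name for (name,) in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    conn.close()
    return rows, tables


rows, tables = cte_demo()
print(rows)    # ids of the rows where x = 0
print(tables)  # my_table does not appear here
```

By contrast, `CREATE TABLE my_table AS (SELECT ...)` would add my_table to that table list, which is the distinction the neighbouring questions draw.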
Which of the following SQL statements can be used to create temporary tables in SQL?
• WITH my_table FROM (SELECT * FROM other_table);
• WITH my_table AS (SELECT * FROM other_table WHERE x = 0);
• CREATE TABLE my_table AS (SELECT * FROM other_table);
• SELECT * FROM table;

A data analyst wants to calculate the number of rows that have a SKU value of “K102145”. Which function could they use?
• =COUNTIF(G2:G30,K102145)
• =COUNTIF(K102145=G2:G30)
• =COUNTIF(G2:G30,"=K102145")
• =COUNTIF(G2:G30,"K102145")

A data analyst wants to use a single function to multiply two ranges and then add the multiplied values. What single function can they use to accomplish this?
• SUM
• SUMPRODUCT
• SUMIF
• SUMIFS

Which values of Date and Direction are used to calculate the value 450 in the following pivot table?
• 2/3 and Down
• 2/4 and Up
• 2/5 and Down
• 2/4 and Down

When writing custom calculations in SQL, what characters can be used to group calculations to change the order of calculation?
• Parentheses – ( )
• Curly Braces – { }
• Quotation Marks – " "
• Square Brackets – [ ]

A data analyst is trying to manually recalculate a column that was present in their dataset. They want to find rows where the values in their column do not match the values in the original column. Which of the following SQL clauses could they use?
• WHERE original_column !! recalculated_column
• WHERE original_column NOT EQUALS recalculated_column
• WHERE original_column <> recalculated_column
• WHERE original_column ~= recalculated_column

When working with a new dataset, how can you ensure that your data is valid?
• Personally collect all data that you use in your analysis.
• Manually check the calculations of calculated columns.
• Convert all data to JavaScript Object Notation (JSON).
• Fill in missing values with values that will favor your initial hypothesis.
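Manually checking a calculated column can be sketched with the `<>` (not equal) operator from the earlier question. The example below uses Python's sqlite3 module. The orders table, its columns, and its values are hypothetical, and the recalculation rule (quantity times unit price) is assumed purely for illustration.

```python
import sqlite3


def find_mismatches():
    """Recalculate a stored column and return the rows where the two values differ."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical data: original_column should equal quantity * unit_price,
    # but the stored value for order 3 is wrong (30 instead of 28).
    conn.executescript("""
        CREATE TABLE orders (
            order_id INTEGER, quantity INTEGER,
            unit_price INTEGER, original_column INTEGER
        );
        INSERT INTO orders VALUES (1, 2, 10, 20), (2, 3, 5, 15), (3, 4, 7, 30);
    """)
    # <> keeps only the rows whose stored value differs from the recalculation.
    mismatches = conn.execute("""
        SELECT order_id,
               original_column,
               quantity * unit_price AS recalculated_column
        FROM orders
        WHERE original_column <> quantity * unit_price
    """).fetchall()
    conn.close()
    return mismatches


mismatches = find_mismatches()
print(mismatches)
```

One caveat for this kind of check: a comparison with NULL is never true in SQL, so rows with missing values are not returned by `<>` and need a separate IS NULL check.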
A data analyst wants to calculate the number of rows that have a value less than 150. Which function could they use?
• =COUNTIF("<150",G2:G30)
• =SUMIF("<150",G2:G30)
• =COUNTIF(G2:G30,"<150")
• =SUMIF(G2:G30,"<150")

What is the purpose of the EXTRACT function in SQL?
• Calculate using data extracted from other tables
• Return a specific key-value pair from a JSON object
• Return a specific portion of a date
• Calculate the mathematical extract operation

Which portion of a pivot table do you change if you want to use a different calculation to combine the results?
• Filter
• Columns
• Values
• Rows

Which of the following statements about temporary tables is correct?
• They must be created using the WITH AS SQL clause.
• They are automatically deleted when the SQL database session ends.
• They are declared by enclosing a FROM statement between ##.
• They are a special feature of BigQuery unavailable in other RDBMS.

Course challenge

You notice that many cells in the city column, Column K, are missing a value. So, you use the zip codes to research the correct cities. Now, you want to add the cities to each donor’s row. However, you are concerned about making a mistake, such as a spelling typo. What spreadsheet tool allows you to control what can and cannot be entered in your worksheet in order to avoid typos?
• List
• Data validation
• VLOOKUP
• Find

Your database contains people who live in many areas of Wyoming. However, it’s important to align your in-house data with the data from Food Justice Rock Springs. You also need to separate your data into the two lists: Donation_Form_List and Postcard_List.
They will be based on each city’s distance from Rock Springs. What SQL function do you use to select all data from the Donation_Form_List organized by zip code?
  • ORGANIZE
  • ORDER BY
  • SEQUENCE
  • ARRANGE BY

You finish cleaning your datasets, so you decide to review Tayen’s email one more time to make sure you completed the task fully. It’s a good thing you checked because you forgot to identify people who have served on the board of directors or board of trustees. She wants to write them a thank-you note, so you need to locate them in the database. To retrieve only those records that include people who have served on the board of trustees or on the board of directors, you use the WHERE function. Which of the following SQL queries would return the needed information?
  • SELECT * FROM Donation_Form_List WHERE Board_Member != 'True' OR Trustee != 'True'
  • SELECT * FROM Donation_Form_List WHERE Board_Member != 'True' AND Trustee != 'True'
  • SELECT * FROM Donation_Form_List WHERE Board_Member = 'True' OR Trustee = 'True'
  • SELECT * FROM Donation_Form_List WHERE Board_Member = 'True' AND Trustee = 'True'

Tayen informs you that she’s thinking about inviting anyone who donated at least $100 in 2018, as well. However, she only has five open spaces. She asks you to report how many people gave at least $100 so she can determine if they can also be invited to the event. Which spreadsheet function do you use to count how many donations of $100 or greater appear in Column O (Contributions 2018)?
  • TOTAL
  • MAX
  • SUMIF
  • COUNTIF

Scenario 1, Questions 1-7

For the past six months, you have been working for a direct-mail marketing firm as a junior marketing analyst. Direct mail is advertising material sent to people through the mail. These people can be current or prospective customers, clients, or donors. Many charities depend on direct mail for financial support. Your company, Directly Dynamic, creates direct-mail pieces with its in-house staff of graphic designers, expert mail list services, and on-site printing.
Your team has just been hired by a local nonprofit, Food Justice Rock Springs. The mission of Food Justice Rock Springs is to eliminate food deserts by establishing local gardens, providing mobile pantries, educating residents, and more. Click below to read the email from Tayen Bell, vice president of marketing and outreach. You begin by reviewing the dataset. To use the template for this dataset, click the link below and select “Use Template.” Link to template: Dynamic Dataset Or, if you don’t have a Google account, download the file directly from the attachment below. The client has asked you to send two separate mailings: one to people within 50 miles of Rock Springs; the other to anyone outside that area. So, to research each donor’s distance from the city, you first need to find out where all of these people live. You could scroll through 209 rows of data, but you know there is a more efficient way to organize the cities. Which of the following procedures will enable you to sort your spreadsheet by city (Column K) in ascending order? Select all that apply.
  • Select A2-R210, then use the drop-down menu to Sort Sheet by Column K from A to Z
  • Use the SORT function syntax: =SORT(A2:R210, 11, TRUE)
  • Select A2-R210, then use the drop-down menu to Sort Range by Column K from A to Z
  • Use the SORT function syntax: =SORT(A2:R210, K, TRUE)

Scenario 1, continued

You notice that many cells in the city column, Column K, are missing a value. So, you use the zip codes to research the correct cities. Now, you want to add the cities to each donor’s row. However, you are concerned about making a mistake, such as a spelling typo. Fill in the blank: To add drop-down lists to your worksheet with predetermined options for each city name, you decide to use _____.
  • VLOOKUP
  • the find tool
  • data validation
  • the LIST function

Scenario 1, continued

Now, you decide to address Tayen’s request to include a handwritten note in the direct-mail piece for anyone who gave at least $100 last year.
Which of the following spreadsheet tools will enable you to change how cells appear if they contain a value of $100 or more?
  • Conditional formatting
  • The COUNTA function
  • The MAX function
  • Data validation

Scenario 1, continued

At this point, you notice that the information about state and zip code is in the same cell. However, your company’s mailing list software requires states to be on a separate line from zip codes. To move the 5-digit zip code in cell L2 into its own column, you use the function =LEFT(L2,5).
  • True
  • False

Scenario 1, continued

Next, you duplicate your dataset twice using the Sheet Menu. You rename the first sheet Donation Form List, and you remove the cities that are further than 50 miles from Rock Springs. You rename the second sheet Postcard List, and you remove the cities that are within 50 miles of Rock Springs. Then, you import these datasets into your company’s mailing list database. In a mailing list database, you create two tables: Donation_Form_List and Postcard_List. You decide to clean the Donation_Form_List first. Your company’s mailing list software requires units to be on the same line as street addresses. However, they are currently in two separate columns (street_address and unit). What portion of your SQL statement will instruct the database to combine these two columns into a new column called “address”?
  • CONCAT(street_address to unit) AS address
  • JOIN(street_address, " to ", unit) AS address
  • CONCAT(street_address, " to ", unit) AS address
  • JOIN(street_address to unit) AS address

Scenario 1, continued

Your database contains people who live across Wyoming. However, it’s important to align your in-house data with the data from Food Justice Rock Springs. You also need to separate your data into the two lists: Donation_Form_List and Postcard_List. They will be based on each city’s distance from Rock Springs. The zip codes are in a column called zip_code. What query do you use to select all data from the Donation_Form_List organized by zip code?
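The column-combining and zip-code-sorting steps described above can be sketched as follows. This is a rough illustration, not the course's answer key: it runs on Python's built-in sqlite3 module, the three sample rows are invented, and because older SQLite versions have no CONCAT function the sketch swaps in SQLite's || concatenation operator. The single-space separator is also an assumption chosen for readability.

```python
import sqlite3

# In-memory table with the column names from the scenario; rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Donation_Form_List (street_address TEXT, unit TEXT, zip_code TEXT);
    INSERT INTO Donation_Form_List VALUES
        ('12 Elk St', 'Apt 4', '82935'),
        ('90 Pine Ave', 'Unit B', '82901'),
        ('7 Sage Rd', 'Suite 2', '82930');
""")

# || joins the two columns into one value aliased as "address"
# (standing in for CONCAT); ORDER BY zip_code sorts ascending by default.
rows = conn.execute("""
    SELECT street_address || ' ' || unit AS address, zip_code
    FROM Donation_Form_List
    ORDER BY zip_code
""").fetchall()

for address, zip_code in rows:
    print(zip_code, address)
```

On a database that supports CONCAT, such as BigQuery, the same idea would be written with CONCAT(street_address, ' ', unit) AS address.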
Scenario 1, continued

You finish cleaning your datasets, so you decide to review Tayen’s email one more time to make sure you completed the task fully. It’s a good thing you checked because you forgot to identify people who have served on the board of directors or board of trustees. She wants to write them a thank-you note, so you need to locate them in the database. To retrieve only those records that include people who have served on the board of trustees or on the board of directors, you use the WHERE function. The syntax is:
  • True
  • False

Scenario 2, Questions 8-13

Your company’s direct-mail campaign was very successful, and Food Justice Rock Springs has continued partnering with Directly Dynamic. One thing you’ve been working on is assigning all donors identification numbers. This will enable you to clean and organize the lists more effectively. Meanwhile, another team member has been creating a prospect list that contains data about people who have indicated interest in getting involved with Food Justice Rock Springs. These people are also assigned a unique ID. Now, you need to compare your donor list with the dataset in your database and collect certain data from both. What SQL function will return records with matching values in both tables?
  • OUTER JOIN
  • INNER JOIN
  • LEFT JOIN
  • RIGHT JOIN

Scenario 2, continued

Your next task is to identify the average contribution given by donors over the past two years. Tayen will use this information to set a donation minimum for inviting donors to an upcoming event. You have performed the calculations for 2019, so now you move on to 2020. To return average contributions in 2020 (contributions_2020), you use the AVG function.
You use the following section of a SQL query to find this average and store it in the AvgLineTotal variable: AVG(contributions_2020) AS AvgLineTotal
  • True
  • False

Scenario 2, continued

Now that you provided her with the average donation amount, Tayen decides to invite 50 people to the grand opening of a new community garden. You return to your New Donor List spreadsheet to determine how much each donor gave in the past two years. You will use that information to identify the 50 top donors and invite them to the event. What is the correct syntax to add the contribution amounts in cells O2 and P2?
  • =SUM(O2*P2)
  • =SUM(O2/P2)
  • =SUM("O2,P2")
  • =SUM(O2,P2)

Scenario 2, continued

Tayen informs you that she’s thinking about inviting anyone who donated at least $100 in 2018, as well. However, she only has five open spaces. She asks you to report how many people gave at least $100 so she can determine if they can also be invited to the event. The correct syntax to count how many donations of $100 or greater appear in Column O is =SUMIF(O2:O210,">=100").
  • True
  • False

Scenario 2, continued

The community garden grand opening was a success. In addition to the 55 donors Food Justice Rock Springs invited, 20 other prospects attended the event. Now, Tayen wants to know more about the donations that came in from new prospects compared to the original donors. This SQL query can be used to identify the percentage of contributions from prospects compared to total donors:
  • True
  • False

Scenario 2, continued

Your team creates a highly effective prospects list for Food Justice Rock Springs. After a few months, many of these prospects become donors. Now, Tayen wants to know the top three cities in which these new donors live. She will use that information to determine if it’s still true that people who live closer to Rock Springs are more likely to donate.
To retrieve the number of donors in each city, sorted high to low, you use the following query:
  • True
  • False

Shuffle Q/A

Scenario 1, Questions 1-7

For the past six months, you have been working for a direct-mail marketing firm as a junior marketing analyst. Direct mail is advertising material sent to people through the mail. These people can be current or prospective customers, clients, or donors. Many charities depend on direct mail for financial support. Your company, Directly Dynamic, creates direct-mail pieces with its in-house staff of graphic designers, expert mail list services, and on-site printing. Your team has just been hired by a local nonprofit, Food Justice Rock Springs. The mission of Food Justice Rock Springs is to eliminate food deserts by establishing local gardens, providing mobile pantries, educating residents, and more. Click below to read the email from Tayen Bell, vice president of marketing and outreach. You begin by reviewing the dataset. To use the template for this dataset, click the link below and select “Use Template.” Link to template: Dynamic Dataset Or, if you don’t have a Google account, download the file directly from the attachment below. The client has asked you to send two separate mailings: one to people within 50 miles of Rock Springs; the other to anyone outside that area. So, to research each donor’s distance from the city, you first need to find out where all of these people live. You could scroll through 209 rows of data, but you know there is a more efficient way to organize the cities. Which of the following functions will enable you to sort your spreadsheet by city (Column K) in ascending order?
  • =SORT(A2:R210, 11, TRUE)
  • =SORT(A2:R210, K, ASC)
  • =SORT(A2:R210, K, TRUE)
  • =SORT(A2:R210, 11, ASC)

Scenario 1, continued

At this point, you notice that the information about state and zip code is in the same cell. However, your company’s mailing list software requires states to be on a separate line from zip codes.
What function do you use to move the 5-digit zip code in cell L2 into its own column?
  • =LEFT(L2,5)
  • =RIGHT(5,L2)
  • =LEFT(5,L2)
  • =RIGHT(L2,5)

Scenario 1, continued

Next, you duplicate your dataset twice using the Sheet Menu. You rename the first sheet Donation Form List, and you remove the cities that are further than 50 miles from Rock Springs. You rename the second sheet Postcard List, and you remove the cities that are within 50 miles of Rock Springs. Then, you import these datasets into your company’s mailing list database. In a mailing list database, you create two tables: Donation_Form_List and Postcard_List. You decide to clean the Donation_Form_List first. Your company’s mailing list software requires units to be on the same line as street addresses. However, they are currently in two separate columns (street_address and unit). You use a SQL function to instruct the database to combine the two columns into a new column called “address.” The syntax is: JOIN(street_address, " to ", unit) as address.
  • True
  • False

Scenario 1, continued

You finish cleaning your datasets, so you decide to review Tayen’s email one more time to make sure you completed the task fully. It’s a good thing you checked because you forgot to identify people who have served on the board of directors or board of trustees. She wants to write them a thank-you note, so you need to locate them in the database. To retrieve only those records that include people who have served on the board of trustees or on the board of directors, what is the correct query?

Scenario 2, Questions 8-13

Your company’s direct-mail campaign was very successful, and Food Justice Rock Springs has continued partnering with Directly Dynamic. One thing you’ve been working on is assigning all donors identification numbers. This will enable you to clean and organize the lists more effectively.
Meanwhile, another team member has been creating a prospect list that contains data about people who have indicated interest in getting involved with Food Justice Rock Springs. These people are also assigned a unique ID. Now, you need to compare your donor list with the dataset in your database and collect certain data from both. What SQL function will return all records from the left table and only the matching records from the right?
  • INNER JOIN
  • OUTER JOIN
  • LEFT JOIN
  • RIGHT JOIN

Scenario 2, continued

Your next task is to identify the average contribution given by donors over the past two years. Tayen will use this information to set a donation minimum for inviting donors to an upcoming event. You start with 2019. To return average contributions in 2019 (contributions_2019), you use the AVG function. What portion of your SQL statement will instruct the database to find this average and store it in the AvgLineTotal variable?
  • AVG("contributions_2019") IN AvgLineTotal
  • AVG(contributions_2019) AS AvgLineTotal
  • AVG("contributions_2019") AS AvgLineTotal
  • AVG(contributions_2019) = "AvgLineTotal"

Scenario 2, continued

Tayen informs you that she’s thinking about inviting anyone who donated at least $100 in 2018, as well. However, she only has five open spaces. She asks you to report how many people gave at least $100 so she can determine if they can also be invited to the event. What is the correct syntax to count how many donations of $100 or greater appear in Column O?
  • =COUNTIF(02:2010,"<=100")
  • =COUNTIF(O2:O210,">=100")
  • =SUMIF(02:2010,">=100")
  • =SUMIF(O2:2010,">=100")

Scenario 2, continued

The community garden grand opening was a success. In addition to the 55 donors Food Justice Rock Springs invited, 20 other prospects attended the event. Now, Tayen wants to know more about the donations that came in from new prospects compared to the original donors. Which SQL query can be used to calculate the percentage of contributions from prospects?
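Several of the SQL questions in this shuffle set (the board-member filter, the two join types, and the average with an alias) can be compared side by side in one small sketch. Everything here is an assumption made for illustration: the donors and prospects tables, their rows, and the interest column are hypothetical, the boolean flags are stored as the text 'True'/'False' as in the quiz options, and the queries run on Python's built-in sqlite3 module rather than the course's database. The percentage query is left out because the source does not show it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE donors (
        DonorID INTEGER, Board_Member TEXT, Trustee TEXT, contributions_2019 REAL
    );
    CREATE TABLE prospects (DonorID INTEGER, interest TEXT);
    INSERT INTO donors VALUES
        (1, 'True',  'False', 100.0),
        (2, 'False', 'False',  50.0),
        (3, 'False', 'True',  150.0);
    INSERT INTO prospects VALUES (2, 'gardens'), (3, 'pantries'), (4, 'education');
""")

# OR keeps a row when either condition holds: board members or trustees.
board_rows = conn.execute("""
    SELECT DonorID FROM donors
    WHERE Board_Member = 'True' OR Trustee = 'True'
    ORDER BY DonorID
""").fetchall()

# INNER JOIN: only IDs present in both tables.
inner_rows = conn.execute("""
    SELECT d.DonorID, p.interest
    FROM donors AS d
    INNER JOIN prospects AS p ON d.DonorID = p.DonorID
    ORDER BY d.DonorID
""").fetchall()

# LEFT JOIN: every donor, with NULL (None) where no prospect row matches.
left_rows = conn.execute("""
    SELECT d.DonorID, p.interest
    FROM donors AS d
    LEFT JOIN prospects AS p ON d.DonorID = p.DonorID
    ORDER BY d.DonorID
""").fetchall()

# AVG(...) AS AvgLineTotal computes the mean and names the result column.
avg_line_total = conn.execute(
    "SELECT AVG(contributions_2019) AS AvgLineTotal FROM donors"
).fetchone()[0]

print(board_rows, inner_rows, left_rows, avg_line_total)
```

Swapping OR for AND in the first query would return no rows with this sample data, since no donor is flagged as both a board member and a trustee.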
Scenario 2, continued

Your team creates a highly effective prospects list for Food Justice Rock Springs. After a few months, many of these prospects become donors. Now, Tayen wants to know the top three cities in which these new donors live. She will use that information to determine if it’s still true that people who live closer to Rock Springs are more likely to donate. What clause do you add to the following query to sort the donors in each city from high to low?
  • ORDER BY CITY(DonorID) ASC
  • ORDER BY COUNT(DonorID) DESC
  • ORDER BY CITY(DonorID) DESC
  • ORDER BY COUNT(DonorID) ASC

Scenario 1, Questions 1-7

For the past six months, you have been working for a direct-mail marketing firm as a junior marketing analyst. Direct mail is advertising material sent to people through the mail. These people can be current or prospective customers, clients, or donors. Many charities depend on direct mail for financial support. Your company, Directly Dynamic, creates direct-mail pieces with its in-house staff of graphic designers, expert mail list services, and on-site printing. Your team has just been hired by a local nonprofit, Food Justice Rock Springs. The mission of Food Justice Rock Springs is to eliminate food deserts by establishing local gardens, providing mobile pantries, educating residents, and more. Click below to read the email from Tayen Bell, vice president of marketing and outreach. You begin by reviewing the dataset. To use the template for this dataset, click the link below and select “Use Template.” Link to template: Dynamic Dataset Or, if you don’t have a Google account, download the file directly from the attachment below. The client has asked you to send two separate mailings: one to people within 50 miles of Rock Springs; the other to anyone outside that area. So, to research each donor’s distance from the city, you first need to find out where all of these people live.
You could scroll through 209 rows of data, but you know there is a more efficient way to organize the cities. Which of the following tools will enable you to sort your spreadsheet by city (Column K) in ascending order?
  • Sort Range by Column K from A to Z
  • Sort Range by Column K from Z to A
  • Sort Sheet by Column K from A to Z
  • Sort Sheet by Column K from Z to A

Scenario 1, continued

Now, you decide to address Tayen’s request to include a handwritten note in the direct-mail piece for anyone who gave at least $100 last year. Which of the following procedures will enable you to change how cells in your spreadsheet appear if they contain a value of $100 or more?
  • Select Column M. Then, select Format > Conditional Formatting. Choose to format cells if they are greater than 100.
  • Select Column M. Then, select Format > Conditional Formatting. Choose to format cells if text starts with 100.
  • Select Column M. Then, select Format > Conditional Formatting. Choose to format cells if text contains 100.
  • Select Column M. Then, select Format > Conditional Formatting. Choose to format cells if they are greater than or equal to 100.

Scenario 1, continued

Your database contains people who live in many areas of Wyoming. However, it’s important to align your in-house data with the data from Food Justice Rock Springs. You also need to separate your data into the two lists: Donation_Form_List and Postcard_List. They will be based on each city’s distance from Rock Springs. The zip codes are in a column called zip_code. To select all data from the Donation_Form_List organized by zip code, you use the ORDER BY function. The syntax is:
  • True
  • False

Scenario 1, continued

You finish cleaning your datasets, so you decide to review Tayen’s email one more time to make sure you completed the task fully. It’s a good thing you checked because you forgot to identify people who have served on the board of directors or board of trustees. She wants to write them a thank-you note, so you need to locate them in the database.
To retrieve only those records that include people who have served on the board of trustees or on the board of directors, what clause do you include in your query?
  • WHERE Board_Member = "TRUE" AND Trustee = "TRUE"
  • WHERE Board_Member = "TRUE" OR Trustee = "TRUE"
  • WHERE Board_Member = TRUE AND Trustee = TRUE
  • WHERE Board_Member = TRUE, Trustee = TRUE

Scenario 2, continued

Your team creates a highly effective prospects list for Food Justice Rock Springs. After a few months, many of these prospects become donors. Now, Tayen wants to know the top three cities in which these new donors live. She will use that information to determine if it’s still true that people who live closer to Rock Springs are more likely to donate. Which SQL query will retrieve the number of donors in each city, sorted high to low?

Scenario 1, continued

At this point, you notice that the information about state and zip code is in the same cell. However, your company’s mailing list software requires states to be on a separate line from zip codes. What function will enable you to move the 2-character state abbreviation in cell L2 into its own column?
  • =RIGHT(L2,2)
  • =RIGHT(2,L2)
  • =LEFT(2,L2)
  • =LEFT(L2,2)

Course 6 – Share Data Through the Art of Visualization

Week 1 – Visualizing data

A data analyst notices that two variables in their data seem to rise and fall at the same time. They recognize that these variables are related somehow. What is this an example of?
  • Correlation
  • Visualization
  • Causation
  • Tabulation

A data analyst adds labels to their line graph to make it easier to read, even though they already have a legend on their visualizations. How does labeling the data make it more accessible?
  • Labeling gives the same information as the legend.
  • Labeling adds contrast to a visualization.
  • Labeling does not depend on interpreting colors.
  • Labeling hides unnecessary information.

You are going to give a presentation to a broad audience. How can you make sure your visualizations are accessible to all members of the audience?
Select two that apply.
  • Include a lot of text in the visualization
  • Minimize contrast between colors
  • Label data directly when possible
  • Provide text alternatives

A data analyst wants to create a visualization that demonstrates how often data values fall into certain ranges. What type of data visualization should they use?
  • Line graph
  • Scatter plot
  • Histogram
  • Correlation chart

What do correlation charts reveal about the data they contain?
  • Causation
  • Relationships
  • Changes
  • Visualization

You are creating a presentation for stakeholders and are choosing whether to include static or dynamic visualizations. Describe the difference between static and dynamic visualizations.
  • Static visualizations are interactive and can automatically change over time. Dynamic visualizations do not change over time unless they’re edited.
  • Static visualizations do not change over time unless they’re edited. Dynamic visualizations are interactive and can automatically change over time.
  • Static visualizations combine multiple visualizations into a whole. Dynamic visualizations separate out the individual elements of a single visualization.
  • Static visualizations separate out the individual elements of a single visualization. Dynamic visualizations combine multiple visualizations into a whole.

Sophisticated use of contrast helps separate the most important data from the rest using the visual context that our brains naturally respond to.
  • True
  • False

Design thinking is a process used to solve complex problems in a visually appealing way.
  • True
  • False

Fill in the blank: During the _____ phase of the design process, you start to generate data visualization ideas.
  • empathize
  • ideate
  • test
  • define

A data analyst adds labels to their line graph to make it easier to read even though they already have a legend on their visualizations.
How does labeling the data make it more accessible?
  • Labeling doesn’t depend on interpreting colors
  • Labeling adds contrast to a visualization
  • Labeling creates more visual interest
  • Labeling helps redirect focus from outliers

Fill in the blank: You should distinguish elements of your data visualization by _____ the foreground and background and using contrasting colors and shapes. This makes the content more accessible.
  • highlighting
  • separating
  • overlapping
  • aligning

Shuffle Q/A

A data analyst working for an e-commerce website creates the following data visualization to present the amount of time users spend on the site: What type of visualization is this?
  • Correlation chart
  • Histogram
  • Line graph
  • Scatterplot

A data analyst is creating a chart for a presentation. The data they will display shows a correlation between variables. Why should they be careful when presenting their chart to an audience?
  • Correlation can be misunderstood as causation.
  • Correlation causes accessibility issues.
  • Correlation should be avoided in charts.
  • Correlation can only be represented in bar charts.

What type of data visualizations allow users to have some control over what they see?
  • Aesthetic visualizations
  • Dynamic visualizations
  • Geometric visualizations
  • Static visualizations

Design thinking is a process used to solve problems in a user-centric way.
  • True
  • False

During which phase of the design process do you try to understand the emotions and needs of your target audience?
  • Prototype
  • Ideate
  • Test
  • Empathize

A data analyst wants to make their visualizations more accessible by adding text explanations directly on the visualization.
What is this called?
  • Distinguishing
  • Subtitling
  • Labeling
  • Simplifying

What should data analysts do to make presentations more accessible for people who are blind and people with low vision?
  • Minimize contrast between colors
  • Remove labels from data
  • Provide text alternatives
  • Avoid using shapes and patterns to differentiate data

You need to create a chart that displays the number of data records in each age group of a dataset. What type of chart would best represent this data?
  • Histogram Chart
  • Ranked Bar Chart
  • Correlation Chart
  • Time Series Chart

Which of the following is generally good practice when using bar charts?
  • Display the bars in ranked order
  • Make the gaps wider than the bars.
  • Design bar charts with a single color.
  • Avoid stacked bar charts.

What are the key elements of effective visualizations you should focus on when creating data visualizations? Select all that apply.
  • Clear meaning
  • Sophisticated use of contrast
  • Visual form
  • Refined execution

Fill in the blank: Design thinking is a process used to solve complex problems _____.
  • as quickly as possible
  • in a user-centric way
  • using a set order of processes
  • with minimal user input

Fill in the blank: A data analyst can make their visualizations more accessible by adding _____, which are text explanations placed directly on the visualizations.
  • labels
  • legends
  • callouts
  • subheadings

Distinguishing elements of your data visualizations makes the content easier to see. This can help make them more accessible for audience members with visual impairments. What are some methods data analysts use to distinguish elements?
  • Ensure all elements are highlighted equally
  • Separate the foreground and background
  • Use similar colors and shapes
  • Add a legend

You need to create a chart that explores how temperature changes throughout the year.
What type of chart would best represent this data?
  • Correlation Chart
  • Time Series Chart
  • Histogram
  • Ranked Bar Chart

What type of visualizations give you the most control over the story you want to tell with your data?
  • Static visualizations
  • Dynamic visualizations
  • Aesthetic visualizations
  • Geometric visualizations

Fill in the blank: When choosing a chart you should choose the one that _____.
  • makes use of the most modern visualization tool
  • uses the least number of visual elements like size and shape
  • uses as many visual elements like size and shape as possible
  • makes it easiest to understand the point you are trying to make

A data analyst is designing a chart. They decide to use colors that make sense to their audience. What phase of creating data visualizations does this describe?
  • Test Phase
  • Ideate Phase
  • Prototype Phase
  • Empathize Phase

During which phase of the design process do you start to generate data visualization ideas?
  • Ideate
  • Test
  • Empathize
  • Define

What should you include in the headline of a data visualization?
  • Abbreviations
  • Clear language
  • Acronyms
  • Fancy typography

A data analyst is making their data visualization more accessible. They separate the background and the foreground of the visualization using bright, contrasting colors. What does this describe?
  • Labeling
  • Text alternatives
  • Distinguishing
  • Text-based format

Causation occurs when an action directly leads to an outcome.
  • True
  • False

What type of charts are effective for presenting the composition of data? Select all that apply.
  • Pie chart
  • Line chart
  • Tree map
  • Heat map

When using design thinking, what group of people should you think about the most?
  • The general public
  • Your team
  • The shareholders
  • Your users

You are in the ideate phase of the design process.
What are you doing at this stage?
  • Making changes to their data visualization
  • Generating visualization ideas
  • Creating data visualizations
  • Sharing data visualizations with a test audience

Where is the best place to put labels that describe the meaning of individual data elements in a data visualization?
  • Left of the chart area
  • In the legend
  • In the data
  • Below the chart area

Fill in the blank: A data analyst creates a presentation for stakeholders. They include _____ visualizations because they don’t want the visualizations to change unless they choose to edit them.
  • aesthetic
  • dynamic
  • static
  • geometric

While creating a chart to share their findings, a data analyst uses the color red to make important data stand out and separate it from the rest of the visualization. Which element of effective visualization does this describe?
  • Refined execution
  • Clear meaning
  • Sophisticated use of contrast
  • Subtitles

You are in the process of creating data visualizations. You have considered the goal, the audience's needs, and come up with an idea. Next, you will share the visualization with peers. What phase of the design process will you be in?
  • Ideate
  • Define
  • Test
  • Empathize

What text element in a visualization should be placed above the chart and clearly state what data is being presented?
  • Headline
  • Label
  • Annotation
  • Subtitle

How much data should you represent when designing an effective data visualization?
  • Include a subset of the data that your audience will like
  • Only represent data that supports your initial hypothesis
  • Include all of the data from your analysis to ensure that your data visualization is complete and accurate
  • Only represent data the audience needs to understand your findings, unless it is misleading

Week 2 – Creating data visualizations with Tableau

A data analyst is using the Color tool in Tableau to apply a color scheme to a data visualization. They want the visualization to be accessible for people with color vision deficiencies, so they use a color scheme with lots of contrast.
What does it mean to have contrast?
  • The color scheme is graphically pleasing.
  • The color scheme uses a range of different colors.
  • The color scheme is uniform.
  • The color scheme is monotone.

You are working with the World Happiness data in Tableau. What tool do you use to change your point of view of Greece?
  • Lasso
  • Pan
  • Rectangular
  • Radial

Tableau is used to create interactive and dynamic visualizations. A visualization is interactive when the audience can control what data they see. What does it mean for a visualization to be dynamic?
  • The visualization can change over time
  • The visualization cannot be altered
  • The visualization can be downloaded
  • The visualization can include audio

A data analyst uses the Color tool in Tableau to apply a color scheme to a data visualization. In order to make the visualization accessible for people with color vision deficiencies, what should they do next?
  • Make sure the color scheme has contrast
  • Make sure the color scheme is uniform
  • Make sure the color scheme uses only one color, in various shades
  • Make sure the color scheme is stylish

You are working with the World Happiness data in Tableau. What tool do you use to select the area on the map representing Central America?
  • Radial
  • Lasso
  • Pan
  • Rectangular

Fill in the blank: A data analyst is working with the World Happiness data in Tableau. To get a better view of Moldova, they use the _____ tool.
  • Lasso
  • Pan
  • Rectangular
  • Radial

You are using the Label tool in Tableau. What will it enable you to do with the World Happiness map visualizations?
  • Separate out a selected country on the map
  • Hide certain countries on the map
  • Display the population of each country on the map
  • Increase the size of a country on the map

You are working with the World Happiness data in Tableau. Which tool will enable you to show certain data while hiding the rest?
  • Format
  • Dimension
  • Filter
  • Attribute

By default, all visualizations you create using Tableau Public are available to other users.
What icon do you click to hide a visualization?
  • Source
  • Close
  • Private
  • Eye

Fill in the blank: In Tableau, a diverging palette displays two value ranges. It uses a color to show the range where a data point is from and color intensity to show its ______.
  • magnitude
  • purpose
  • origination
  • attributes

Shuffle Q/A

Tableau is used to create dynamic and interactive visualizations. Dynamic visualizations can change over time. What does it mean for a visualization to be interactive?
  • The audience can export the datasets
  • The audience can control what data they see
  • The audience can listen to audio about the data
  • The audience can collaborate on changes to the data

A data analyst uses the Color tool in Tableau to apply a color scheme to a data visualization. Why do they make sure the color scheme has contrast?
  • To make the visualization more stylish for users to enjoy
  • To make the visualization more elaborate
  • To make the visualization uniform
  • To make the visualization accessible for people with color vision deficiencies

A data analyst is working with the World Happiness data in Tableau. What tool do they use to select the area on the map representing Finland?
  • Pan
  • Rectangular
  • Radial
  • Lasso

A data analyst is using the Pan tool in Tableau. What are they doing?
  • Rotating the perspective while keeping a certain object in view
  • Taking a screenshot of the visualization
  • Copying a data point to a second location in the visualization
  • Deselecting a data point from within the visualization

Fill in the blank: In Tableau, the Label tool is located on the _____ shelf.
  • pages
  • columns
  • rows
  • marks

A data analyst is giving a presentation with the World Happiness data in Tableau. Their insights focus only on those countries with a happiness score greater than 4.5. What tool can they use to show only those countries while hiding the rest?
  • Format
  • Filter
  • Attribute
  • Dimension

An analyst working in Tableau uses color to show the range where a data point is from and intensity to show its magnitude.
What is this called?Value overlayDiverging paletteColor attributeConditional formatting Fill in the blank: In Tableau, a _____ visualization is one that can change over time.interpretivedynamicinteractivesensitive You are designing a visualization in Tableau and you want to ensure it is accessible. What can you apply with the Color tool in Tableau to make your visualization accessible for people with color vision deficiencies?palettefilteringvariationcontrast You are working with the World Happiness data in Tableau. What tool do you use to change your point of view of Italy?RadialRectangularPanLasso A data analyst working with the World Happiness data in Tableau displays the populations of each country in their visualization. What tool did they use?DetailTooltipSizeLabel A data analyst is creating a visualization in Tableau Public. They want to keep it private from other users until it is complete. Which icon should they click?SourceClosePrivateEye Fill in the blank: When using Tableau, people can control what data they see in a visualization. This is an example of Tableau being _____.indefinableinterpretiveinanimateinteractive A data analyst working with the World Happiness data in Tableau is only interested in those countries that have a happiness score of less than 3.5. What tool can they use to only show these countries?DimensionAttributeFormatFilter A data analyst is creating a visualization in Tableau public. Before they began, they clicked on the eye icon. What is the purpose for this?It hides the visualization from other users.It generates a new copy of the visualization.It saves the visualization.It gives access to Tableau’s options. Fill in the blank: In Tableau, a _____ palette displays two value ranges. 
Color shows the range where a data point is from and color intensity shows its magnitude.divergingoverlayinginvertingcontrasting What could a data analyst do with the Lasso tool in Tableau?Zoom in on a data pointMove a data pointSelect a data pointPan across data points Fill in the blank: In Tableau public, the _____ icon will hide your visualization from other users.closeeyesourceprivate Fill in the blank: In Tableau, a diverging palette displays two value ranges. It uses a color to show the range where a data point is from and _____ to show its magnitude.markersbordersintensitycolor overlays A data analyst creates a visualization in Tableau that allows the audience to change what data they want to see. What is such a visualization called?indefinablestaticinteractivecombo A data analyst creates a visualization with lots of contrast so that it is accessible for people with color vision deficiencies. What tool in Tableau does this?Color toolContrast toolPan toolLasso tool You are working with the World Happiness data in Tableau. You use the pan tool on the country of Japan. What is the result?It changes your point of view to Japan.It selects Japan.It filters Japan so it cannot be seen.It applies the current color scheme to Japan. Fill in the blank: A data analyst is working with the World Happiness data in Tableau. They use the _____ tool on the Marks shelf to display the population of each country on the map.sizedetaillabeltooltip What tool could you use in Tableau to show only those countries with a World Happiness score of 4.0 or less?AttributeFormatFilterDimension Fill in the blank: In Tableau, a(n) _____ visualization is one in which the audience can change what data they see.staticinteractivecomboindefinable Fill in the blank: You are creating a visualization with the World Happiness data from Tableau. 
With the Label tool, you can display the _____ of a specific attribute for each country on the visualizationcolorlocationtruthvalue A data analyst creates a visualization in Tableau showing their company’s quarterly sales data. They color all the items that have made a profit green and those in which they have a loss red. In addition, they intensify the color based on the magnitude of the profit or loss. What tool are they using?Diverging paletteValue overlayConditional formattingColor attribute Week 3 – Crafting data stories You are preparing to communicate to an audience about an analysis project. You consider the roles that your audience members play and their stake in the project. What aspect of data storytelling does this scenario describe?ThemeEngagementTakeawaysDiscussion A data analyst wants to communicate to others about their analysis. They ensure the communication has a beginning, a middle, and an end. Then, they confirm that it clearly explains important insights from their analysis. What aspect of data storytelling does this scenario describe?SpotlightingTakeawaysNarrativeSetting A data analyst prepares to communicate to an audience about an analysis project. They consider what the audience members hope to do with the data insights. This describes establishing the setting.TrueFalse When designing a dashboard, how can data analysts ensure that charts and graphs are most effective? 
Select all that apply.Incorporate all of the data points from the analysisMake good use of available spacePlace them in a balanced layoutInclude as many visual elements as possible What are the key differences between tiled and floating items in Tableau?Tiled items create a single-layer grid that contains no overlapping elements; floating items can be layered over other objects.Tiled items are connected by straight lines; floating items are unconnected.Tiled items always have a square layout; floating items are always based on circles.Tiled items can be layered over other objects; floating items create a single-layer grid that contains no overlapping elements. A data analyst creates a scatter plot in Tableau and notices an outlier. What should they do next?Use a filter to highlight the outlier, as it is more important than the rest of the dataInvestigate the outlier to determine if it can lead to any important observationsShift the outlier to the center of the other data points for conformityRemove the outlier, as it is unlikely to lead to any important observations You are creating a dashboard in Tableau to share with stakeholders. Why might you decide to pre-filter the dashboard? Select all that apply.To eliminate data points that do not support your conclusionsTo save stakeholders the effort of filtering the dashboard themselvesTo save stakeholders time in finding important dataTo direct stakeholders to important data Fill in the blank: A data analyst is creating the title slide in a presentation. The data they are sharing is likely to change over time, so they include the _____ on the title slide. This adds important context.key findings of the presentationname of the data sourcedata analysts involved in the projectdate of the presentation A data analyst wants to include a visual in their slideshow, then make some changes to it. Which of the following options will enable the analyst to edit the visual within the presentation without affecting its original file? 
Select all that apply.Connect the original visual to the presentation via its URLCopy and paste the visual into the presentationEmbed the visual into the presentationLink the original visual within the presentation Shuffle Q/A Fill in the blank: A data-storytelling narrative draws a connection between the data and the specific _____ of the project.stakeholderstasksobjectivesmanagers A data analyst scans the data to quickly identify the most important insights. This describes spotlighting.TrueFalse Fill in the blank: An important part of dashboard design is the placement of charts, graphs, and other visual elements. They should be _____, which means that they are balanced and make good use of available space.constantcompletecleancohesive Fill in the blank: In Tableau, _____ items create a single-layer grid that contains no overlapping elements.fixedlayeredtiledfloating While preparing a presentation, you decide to limit the number of lines and words on each slide. This will help keep your audience attentive to what you are saying rather than focusing on reading slides. What is the greatest number of lines and words you should use on each slide? 2 lines and 15 words; 10 lines and 100 words; 5 lines and 25 words; 3 lines and 10 words You are creating a slideshow for a client presentation. There is a pivot table in a spreadsheet that you want to include. In order for the pivot table to update whenever the spreadsheet source file changes, how should you incorporate it into your slideshow? Select all that apply.Copy and paste the pivot tableInsert a PDF of the pivot tableLink the pivot tableEmbed the pivot table A data analyst wants to tell a story with data. As a second step, they focus on showing the story of the data to highlight the meaning behind the numbers. 
Which step of data storytelling does this describe?Assemble word cloudsCreate compelling visualsEngage your audienceTell an interesting narrative Which of the following questions do data analysts ask to make sure they will engage their audience? Select all that apply.What information will convince the audience that my opinion is correct?What roles do the people in this audience play?What does the audience hope to do with the data insights?What is the audience’s stake in the project? A data analyst links their visualizations to external spreadsheets containing the data being described. What is the purpose for doing this?It allows changes made to the spreadsheet data that will not change the visualizations.It allows for the creation of multiple visualizations using the same dataset.It allows the visualization to be edited without the spreadsheet data being affected.It allows changes made to the spreadsheet data to automatically reflect in the visualizations. What three key components are required in a data storytelling narrative?Stakeholders, analysts, and customersSpotlighting, setting, and takeawaysMeasurement, data, and analysisBeginning, middle, and end You are designing a dashboard in Tableau. You choose a layout that allows objects to be layered over other items in the dashboard. What type of layout is this?TiledVerticalHorizontalFloating On a scatterplot, what is the term for a point that lies far from the rest of the points?An errorA filterAn outlierAn anomaly A data analyst wants to save stakeholders time and effort when working with a Tableau dashboard. They also want to direct stakeholders to the most important data. What process can they use to achieve both goals?Pre-filteringPre-sortingPre-sizingPre-building Fill in the blank: An effective slideshow guides your audience through your main communication points, but it does not repeat every word you say. 
A best practice is to keep text to fewer than _____ lines and 25 words per slide. 2, 10, 5, 15 A data analyst has multiple points to show with the same visualization. What should they do to communicate these points effectively to their audience?Save some of the points to use in another presentationCreate a new visualization for each point they need to makeLimit the number of points to only a few that are the most relevantIdentify each point on the same visualization using arrows A data analyst wants to tell a story with data. As a first step, they consider who will be listening to the data story and focus on capturing and holding their audience's interest. Which step of data storytelling does this describe?Assemble word cloudsTell an interesting narrativeEngage your audienceCreate compelling visuals You are preparing to communicate to an audience about an analysis project. You know that audience engagement is a crucial part of getting them to listen to what you have to say. You compile a list of the insights from your work and review it to identify both the key takeaways and the details that are less relevant. What process does this describe?NarrativeSpotlightingDiscussionTakeaways A data analyst is designing a dashboard. They make sure that the charts, graphs, and other visual elements are balanced and make good use of available space. What dashboard best practice does this describe?DetailCompletenessCohesionLabeling You are sharing your Tableau dashboard with stakeholders. What process can you implement so the stakeholders do not need to filter the dashboard themselves?Pre-sizingPre-filteringPre-building You want to include a visual in your slideshow that will update automatically when its original source file updates. 
Which of the following actions will enable you to do so?Copy and paste the visual into the presentationTake a screenshot of the visual and paste it into the presentationLink the original visual within the presentationEmbed the visual into the presentation An analyst is designing a dashboard. In order for it to be effective, they make sure that the charts, graphs, and other visual elements are balanced. What else should they do to make the dashboard design cohesive?Fill it with color.Make good use of space.Put in lots of detail.Make sure the dashboard is complete. Fill in the blank: When a data analyst notices a data point that is very different from the norm in a scatterplot, the best course of action is to _____ the outlier.investigatemoveremovehide You are working on a huge dataset and visualizing your data with Tableau. As a next step, you want to focus on only the data that is most important. Which Tableau tool can you use to limit the data displayed on the dashboard?Pre-filteringPre-buildingPre-sortingPre-sizing Fill in the blank: An effective slideshow guides your audience through your main communication points, but it does not repeat every word you say. A best practice is to keep text to fewer than five lines and _____ words per slide. 50, 5, 100, 25 A data analyst embeds their visualizations in their slideshow. These visualizations are based on data contained in external spreadsheets. Why might the analyst do this rather than copying and pasting the visualization?Subsequent changes made to the spreadsheet data will automatically be reflected in the slideshow.The visualizations will remain with the spreadsheet file instead of the presentation.Subsequent changes made to the spreadsheet data will not affect the visualization.The visualizations can be edited directly in the slideshow. Week 4 – Developing presentations and slideshows You are presenting your theory about the correlation between recent sales increases and a current pop culture trend. 
When is the best time to establish your presentation’s hypothesis for the audience?During the introductionBefore the conclusionDuring the conclusionBefore the presentation A data analyst gives a presentation about predicting upcoming investment opportunities. How does establishing a hypothesis help the audience understand their predictions?It describes the data thoroughlyIt summarizes the findings succinctlyIt visualizes the data clearly and conciselyIt provides context about the presentation’s purpose According to the McCandless Method, what is the most effective way to first present a data visualization to an audience?Answer obvious questions before they’re askedState the insight of the graphicTell the audience why the graphic mattersIntroduce the graphic by name You are preparing for your first presentation at a new job. Which strategies can help you combat nervousness about presentations? Select all that apply.Improvise your material to speak naturallyPractice and prepare your materialDo breathing exercises to calm your body downChannel your nervousness into excitement about your topic You are preparing for a presentation and want to make sure your nerves don’t distract you from your presentation. Which practices can help you stay focused on an audience? Select three that apply.Speak as quickly and briefly as possibleUse short sentencesKeep the pitch of your voice levelBe mindful of nervous habits You are running a colleague test with your coworkers. One coworker points out that she doesn’t understand one of your graphs. What can you do to prepare for presenting to your stakeholders? Select all that apply.Redesign the graphElaborate on the data from the graphMove the graph to a later slideRemove the graph Your stakeholders express concern that the results of your analysis are very different from the predictions they made last year. 
Which kind of objection are they making?DataAnalysisPresentation skillsFindings A stakeholder objects to the steps of your analysis. What are some appropriate ways to respond to this objection? Select all that apply.Explain why you think any discrepancies existTake steps to investigate your analysis question furtherCommunicate the assumptions you made in your analysisDefend the results of your analysis You notice that your audience is not as engaged as you’d like during your Q&A. Which of the following are ways to get them more involved?Keep your pitch levelRepeat your key findingsWait longer for the audience to ask questionsAsk them for insights Shuffle Q/A A purchaser at your company wants to optimize the price they will pay to order office supplies for the coming year. Which of the following is a good initial hypothesis to test in order to help the purchaser optimize their spending? Select all that apply.Office supply prices increase seasonally.Office supply prices remain the same throughout the year.The budget for office supplies should increase.The budget for office supplies can remain the same. According to the McCandless Method, when should you present the data that supports insights?After stating insightsBefore stating insightsAt the end of the presentationAt the beginning of the presentationWhile stating insights An analyst introduces a graph to their audience to explain an analysis they performed. Which strategy would allow the audience to absorb the data visualizations? Select all that apply.Practicing breathing exercisesImproving body languageUsing the five second ruleStarting with broad ideas During a presentation, one of your stakeholders expresses concern that you did not control for differences in the data. Which kind of objection are they making?FindingsPresentation SkillsDataAnalysis During a meeting, a colleague on your team points out a flaw in your analysis that you had not noticed before. 
What steps should you take to respond to their objection? Select all that apply.Hide evidence that you were incorrectFollow up with your colleagueInvestigate the issueAcknowledge that their objection is valid You are presenting to a large audience and want to keep everyone engaged during your Q&A. What can you do to ensure your audience doesn’t grow disinterested despite its size?Ask your audience for insightsWait longer for the audience to ask questionsRepeat your key findingsKeep your pitch level Which of the following statements is true about using a hypothesis in your data presentation?Include the hypothesis in a summary at the end of your presentationChoose a hypothesis your audience will likeInclude a new hypothesis before every data visualizationPresent the hypothesis early in your presentation Why is it important to state the insights from your graphic when using the McCandless Method?To get everyone on the same page before you give supporting detailsTo make sure your audience understands why the data mattersTo ensure that you establish credibility as a serious data analystTo add a strong finish to your presentation A researcher is presenting the data for their study. What can they do to ensure their presentation is impactful?Ensure their delivery is as well executed as their analysisSuppress their excitement to remain passive and neutralStart with really narrow ideas and work toward broad ideasFocus on the data instead of focusing on presentation skills You run a colleague test on your presentation before getting in front of an audience. Your coworker asks a question about a section of your analysis, but addressing their concern would mean adding information you didn’t plan to include. How should you proceed with building your presentation? 
Select all that apply.Leave the presentation as-isKeep the concern in mind and anticipate that stakeholders may ask the same questionRemove the section of the analysis that prompted the questionExpand your presentation by including the information One of your stakeholders tried to reproduce the work you presented by using a copy of your scripts and was unable to get the same results. Which kind of objection are they making?DataAnalysisPresentation skillsFindings One of your co-workers is giving a presentation on the results of an analysis the two of you have been working on. Someone in the audience points out that the data system you used has frequent errors. How should you deal with this comment?Assume you were given valid dataTell them they should have looked at the appendixExplain how you cleaned and formatted the dataIgnore the question and move on Why should you repeat questions that you receive during your presentation? Select all that apply.It helps you take up more time.It gives you a moment to think.It allows you to ensure you understood the question.It ensures you focus on the person asking the question instead of the whole audience. You give a presentation on your latest data analysis and receive feedback from the audience that they did not understand the context of the analysis. What might have caused this problem? Select all that apply.Your hypothesis was stated early.Your hypothesis was not included.Your hypothesis was a disprovable theory.Your hypothesis was stated too late. According to the McCandless Method, what is the most effective way to finish presenting data to an audience?Call out data to support your insightsTell your audience why it mattersAnswer any obvious questions before they’re askedState the insight of your graphic You are putting together a list of your peers to run colleague tests with. 
What are some qualities of good peers to target?They are very different from your audienceThey are familiar with your previous workThey worked on the analysis with youThey have no prior knowledge of your work Your stakeholders are concerned that you inappropriately removed data during the initial phases of your project. Which kind of objection are they making?FindingsDataPresentation SkillsAnalysis You are presenting to your stakeholders an analysis of your company’s latest quarter earnings. Your stakeholders express concern that your projections for next quarter are lower than expected. What are appropriate ways to respond to these objections? Select all that apply.Explain why you think the discrepancies existRepeat the steps you tookTake steps to investigate your analysis question furtherCommunicate the assumptions you made in your approach After a presentation one of your peers points out that you were unable to answer audience questions very well. Which step can you take to improve your question answering?Answer questions immediately with highly detailed answersStart thinking of answers during the questionFocus your responses on people that ask questions instead of the whole audienceRepeat questions to ensure you understood What is the final step, or the “so what?” phase, to the McCandless Method? This is the point where you present the possible business impact of the solution and clear actions stakeholders can take?State the insight of the graphicTell your audience why it mattersCall out data to support that insightAnswer obvious questions before they’re asked During a presentation, you stop and wait for five seconds after displaying a new graphic. According to the McCandless method, what should you do after that delay?Ask if there are any questionsWait another five secondsMove on to the next topicReturn to the previous content You are giving a presentation to the leadership of a local community organization. 
How can you effectively communicate your findings to them?Focus on what you found interestingFocus on what the audience needs to hearFocus on specific technical details of your analysisFocus on speaking without any pauses You are on a team of analysts presenting to your stakeholders. Your teammate responds to an objection about your steps of analysis by repeating the steps and then getting defensive when the stakeholders don’t seem to understand. What could they have done to respond to the objection more appropriately? Select all that apply.Promise to investigate your analysis question furtherRemind the stakeholders of your successesAcknowledge that the objection is validDescribe the approach you took in your analysis You are getting ready to give the biggest presentation of your career. Which of the following methods might help you prepare to give the presentation? Select all that apply.Write a script and repeat it in your headHold a dress rehearsal at the presentation locationAvoid thinking about the presentationVisualize giving the presentation You are presenting to your stakeholders and want to convey confidence. How should your body language reflect your composure? Select all that apply.Stand up straight and be stillGesture enthusiastically to illustrate each pointMake eye contact with audience membersPace as you speak to the audience As part of an internship you are giving a presentation of your work to the rest of the department. Why might you want to perform a colleague test? Select all that apply.It helps you come up with highly detailed answers.It can help find places your audience might get confused.It can help you discover jargon to include. You are introducing a data visualization during your presentation and are concerned that it may overwhelm your audience. 
How can you allow the audience to process the information when you first introduce the visualization?Wait five secondsThoroughly explain the contextDescribe each graph quicklyDefine each parameter You are preparing to present in front of a large audience. Which of the following is a best practice for speaking to an audience?Speak as quickly as possibleTake as few pauses as possibleTake long pauses between sentencesSpeak at a relaxed pace in short sentences You are running a colleague test with your coworkers. One coworker points out that your data has limitations. What can you do to prepare to explain the limitations of your data? Select all that apply.Consider the contextCritically analyze any correlationsUnderstand the strength and weaknesses of your toolsBe ready with industry jargon and acronyms Course challenge Next, you decide on your data narrative’s characters, setting, plot, big reveal, and aha moment. The characters are the people affected by your story. This includes your stakeholders, Gaea’s customers, and Gaea’s potential future customers. For the setting, you describe the current situation, potential tasks, and background information about the analysis project.As you begin to work on the plot for the data narrative, which of the following ideas would you include? Select all that apply.Why it’s important for Gaea to increase its cars’ battery range by 2025How your data analysis can help Gaea solve its business problemsThe challenges associated with the current lack of vehicle charging stationsA list of your recommendations and details about why they will help Gaea be successful After creating data visualizations about the current state of the electric vehicle market, you turn to projections. You want to communicate to stakeholders about the importance of longer vehicle battery range to consumers.Your team analyzes data from a consumer survey that investigated the importance of longer battery range when choosing whether to purchase an electric car. 
The current average battery range is about 210 miles. By 2025, that distance is expected to grow to 450 miles per charge.You create the following pie chart:After reviewing your pie chart, you realize that it could be improved. How do you make this chart more effective?Write a longer title to add more detail about the data the pie chart containsRemove the labels for the number of miles per charge consumers will require before purchasing an electric vehicleResize the pie segments so they visually show the different valuesAdd an x-axis and y-axis to provide additional explanation about the data As a final step in the data-sharing process, you think about how to respond during the Q&A session. What strategies will you employ when answering questions? Select all that apply.Involve your whole audienceListen to the whole question, and repeat it, if necessaryUnderstand the context of the questionProvide detailed, comprehensive responses Scenario 1, questions 1-9You have been working as a junior data analyst at Bowling Green Business Intelligence for nearly a year. Your supervisor, Kate, tells you that she believes you are ready for more responsibility. She asks you to lead an upcoming client presentation. You will be responsible for creating the data story, identifying the right tools to use, building the slideshow, and delivering the presentation to stakeholders. Your client is Gaea, an automotive manufacturer that makes eco-friendly electric cars. For the past year, you have been working with the data team in Gaea’s Bowling Green, Kentucky, headquarters. For the presentation, you will engage the data team, as well as its regional sales representatives and distributors. Your presentation will inform their business strategy for the next three-to-five years. You begin by getting together with your team to discuss the data story you want to tell. You know the first step in data storytelling is to engage your audience. 
Fill in the blank: A big part of engagement is knowing how to eliminate less important details. So, you use spotlighting to _____ the data in order to identify the most important insights.recheckscanstudyresearch Scenario 1, continued After you identify the most important insights, it’s time to create your primary message. Your team’s analysis has revealed three key insights: Electric vehicle sales demand is expected to grow by more than 400% by 2025. The number of publicly available vehicle charging stations is a significant factor in consumer buying decisions. Currently, there are many locations with so few charging stations that electric car owners would run out of power when traveling between stations. Vehicle battery range is also a significant factor for consumers. In 2020, the average battery range was 210 miles. However, the vast majority of survey respondents report they will not buy an electric car until the battery range is at least 300 miles per charge. Based on these insights, you create your primary message. Which of the following reflect the expectations of a primary message?The number of publicly available vehicle charging stations is a significant factor in consumer buying decisions. Therefore, Gaea must begin building vehicle charging stationsAlthough electric vehicle sales demand is on the rise, low availability of charging stations and short battery range are significant hurdles that Gaea must overcomeElectric vehicle sales demand is expected to grow by more than 400% by 2025. However, the number of publicly available vehicle charging stations is a significant factor in consumer buying decisions. Currently, there are many locations with so few charging stations that electric car owners would run out of power when traveling between stations. Vehicle battery range is also a significant factor for consumers. In 2020, the average battery range was 210 miles. 
However, the vast majority of people say they will not buy an electric car until the battery range is at least 300 miles per chargeElectric vehicle demand is skyrocketing Scenario 1, continued Next, you decide on your data narrative’s characters, setting, plot, big reveal, and aha moment. During the narrative, you want to communicate to your stakeholders about the challenges associated with the current lack of vehicle charging stations and why it's important for Gaea to increase its cars’ battery range by 2025. Information about charging stations and the need to increase battery range will be part of the setting of your data story.TrueFalse Scenario 1, continued Now, it’s time to consider which tools to use to create data visualizations that will clearly communicate the results of your analysis. You and your team decide to make both spreadsheet charts and Tableau data visualizations. In addition, you want to provide them with a tool that will achieve the following goals: Organize multiple datasets about electric vehicle battery ranges into a central location Enable tracking and analysis of electric vehicle data Simplify data visualizations about the number of available charging stations using maps of the different geographies What tool do you create for your stakeholders?DashboardSpreadsheetDatabaseAlgorithm Now that you have finished planning the data story with your team, it’s time to create data visualizations. First, you consider electric vehicle sales worldwide in 2015 compared to 2020. You use a spreadsheet to create the following bar graph to compare the two values:You want to add a label to represent the scale (total count by year) of electric vehicle sales. Where on the graph do you label these values?The colorsThe vertical barsThe y-axisThe x-axis Next, you explore how access to public car-charging stations is influencing electric vehicle purchases. 
As your analysis has revealed, there are many areas without enough places for people to plug in and charge their cars. This lack of charging stations has a negative impact on demand for electric cars and overall vehicle sales. You use Tableau to create the following draft of a visualization, which organizes the charging station data geographically:After reviewing your draft, you realize that it could be improved. Fill in the blank: To improve your draft, you select more varied hues and make the color intensity stronger. In addition, you choose darker _____ in order to reflect more light.viewsvaluesvisualsvariables Scenario 1, continued Now, you want to highlight what your team’s analysis discovered about the number of charging stations available compared to the number of cars purchased. Your data has confirmed that the lack of charging stations causes the effect of fewer car sales. To communicate this effectively, you will need to convey causation to the stakeholders. You explain that causation is the measure of the degree to which two variables move in relationship to each other. In the case of Gaea’s business, charging station numbers and car sales move in the same direction.TrueFalse Scenario 1, continued Once you finish creating data visualizations about the current state of the electric vehicle market, you turn to projections for the future. You want to communicate to stakeholders about the importance of longer vehicle battery range to consumers. Your team’s data includes feedback from a consumer survey that investigated the importance of longer battery when choosing whether to purchase an electric car. The current average battery range is about 210 miles. By 2025, that distance is expected to grow to 450 miles per charge. You create the following pie chart:Fill in the blank: After reviewing your pie chart, you realize that it could be improved. 
You resize the _____ so they visually show the different values.labelsaxessegmentsvalues Scenario 1, continued It’s time to build your Tableau dashboard for stakeholders. You consider what type of layout to use.You decide that you want to be able to adjust the width of the views and the data visualizations about electric vehicle sales, charging stations, and battery range. Which type of layout will enable you to do that?Vertical layoutCircular layoutDiagonal layoutHorizontal layout Scenario 2, questions 10-15 You have created your narrative and visuals, so now it’s time to build a professional and appealing slideshow. You choose a theme that matches the tone of your presentation. Then, you create a title slide with a title, subtitle, and the date. Next, you create the following slide that compares electric vehicle sales in 2015 and 2020:After reviewing your slide, you realize that it could be improved. What steps do you take to make the two text boxes beneath the header more effective? Select all that apply.Add Your Heading Text HereEdit the text to fewer than five lines totalEnsure the text does not simply repeat the words you plan to sayUse abbreviations to reduce the amount of textEdit the text to fewer than 25 words total Scenario 2, continued You then create the following slide to demonstrate the challenges associated with battery range and charging stations:After reviewing your slide, you realize that the visual elements could be improved. A good solution would be for you to choose one data visualization to share on this slide, then create another slide for the second data visualization.TrueFalse Scenario 2, continued You complete your slideshow and share it with your team. Once it is approved by your supervisor, you begin preparing to give your presentation. You consider maintaining good posture, being aware of nervous habits, and making eye contact. In addition, you think about how you will explain the data visualizations. 
One of the strategies you practice is the five-second rule. What are some key aspects of this rule? Select all that apply.Ask your audience if they understand the data visualizationBe prepared to explain the data visualizationTell your audience the conclusion that you want them to understandTake no more than five seconds to explain the data visualization Scenario 2, continued Next, you prepare for the question-and-answer session that will follow your presentation. To predict what questions they may ask, you do a colleague test of your presentation. You should choose a colleague who has deep expertise in the electric vehicle industry.TrueFalse Scenario 2, continued Now that you have some idea of the questions the stakeholders will ask, you and a team member consider different objections that might arise. Your team member asks you how you will respond if someone from Gaea questions your data-cleaning process. How do you prepare for this objection? Select all that apply.Keep a detailed log of your data-cleaning processPractice answering questions about your data-cleaning processAdd your data-cleaning log to the slideshow appendixBe prepared to explain why data cleaning is not relevant at this stage of the project Scenario 2, continued The big day has arrived, and you have just finished giving your presentation to the Gaea team. It’s now time for the question-and-answer session, and a stakeholder asks you a very detailed question about one specific electric vehicle charging station initiative. You listen to the whole question, then repeat it. For what reasons is this important? 
Select all that apply.It ensures the entire audience has heard the question, in case they did not when it was originally askedIt enables you to rephrase it in a way that is easier to answerIt helps you confirm that you understand the questionIt gives the stakeholder a chance to correct you if you misunderstand Shuffle Q/AScenario 1, questions 1-9 You have been working as a junior data analyst at Bowling Green Business Intelligence for nearly a year. Your supervisor, Kate, tells you that she believes you are ready for more responsibility. She asks you to lead an upcoming client presentation. You will be responsible for creating the data story, identifying the right tools to use, building the slideshow, and delivering the presentation to stakeholders. Your client is Gaea, an automotive manufacturer that makes eco-friendly electric cars. For the past year, you have been working with the data team in Gaea’s Bowling Green, Kentucky, headquarters. For the presentation, you will engage the data team, as well as its regional sales representatives and distributors. Your presentation will inform their business strategy for the next three-to-five years. You begin by getting together with your team to discuss the data story you want to tell. You know the first step in data storytelling is to engage your audience. You use spotlighting to help you identify the most important insights. Which of the following activities are involved with spotlighting? Select all that apply.Determining the data’s partialityIdentifying connections or patternsFinding ideas or concepts that keep arisingNoticing repeated words or numbers Scenario 1, continued Once you have identified the most important insights, it’s time to create your primary message. Your team’s analysis has revealed three key insights: Electric vehicle sales demand is expected to grow by more than 400% by 2025. The number of publicly available vehicle charging stations is a significant factor in consumer buying decisions. 
Currently, there are many locations with so few charging stations that electric car owners would run out of power when traveling between stations. Vehicle battery range is also a significant factor for consumers. In 2020, the average battery range was 210 miles. However, the vast majority of survey respondents report they will not buy an electric car until the battery range is at least 300 miles per charge. Based on these insights, you create your primary message. What are the expectations of a primary message? Select all that apply.ClearDirectComprehensiveSubtle Scenario 1, continued Next, you decide on your data narrative’s characters, setting, plot, big reveal, and aha moment. During the narrative, you want to communicate to your stakeholders about the challenges associated with the current lack of vehicle charging stations and why it's important for Gaea to increase its cars’ battery range by 2025. In which part of your data narrative would you include information about charging stations, the need to increase battery range, and why it’s important for Gaea to increase its cars’ battery range?Aha momentSettingPlotBig reveal Scenario 1, continued It’s time to build your Tableau dashboard for stakeholders. You consider what type of layout to use. Describe the differences between vertical and horizontal layouts. Select all that apply.Vertical layouts prevent items from being layered over other objectsVertical layouts adjust the height of the views and objects containedHorizontal layouts adjust the width of the views and objects containedHorizontal layouts prevent items from being layered over other objects Scenario 2, questions 10-15 You have created your narrative and visuals, so now it’s time to build a professional and appealing slideshow. You choose a theme that matches the tone of your presentation. Then, you create a title slide with a title, subtitle, and the date. 
Next, you create the following slide to communicate information about electric vehicle sales in 2015 compared to 2020:Alt-text: Slideshow with bar chart of electric vehicle sales from 2015 and 2022. 2022 had higher sales. There are also multiple sentences at the bottom of the slide and another piece of descriptive text near the chart. To improve the slide, you remove the text box at the bottom. For what reasons will this make your slide more effective? Select all that apply.Slide text should be fewer than 25 words totalThe text shouldn’t simply repeat the words you sayThe font size is too small for your audience to readSlide text should be no more than 10 lines total Scenario 2, continued You complete your slideshow and share it with your team. Once it is approved by your supervisor, you begin preparing to give your presentation. You consider maintaining good posture, being aware of nervous habits, and making eye contact. In addition, you think about how you will speak. What strategies can help you speak effectively? Select all that apply.Building in intentional pauses to give your audience time to think about what you have just saidSpeaking quickly so you are sure to have time to include all important data pointsUsing short words and sentencesKeeping the pitch of your sentences level so that your statements are not confused for questions Scenario 2, continued Next, you prepare for the question-and-answer session that will follow your presentation. What methods help you consider any limitations of your data? Select all that apply.Understand the strengths and weaknesses of the toolsEliminate the outliersLook at the contextCritically analyze the correlations Scenario 2, continued The big day has arrived, and you finish your presentation to the Gaea team. In the question-and-answer session, a stakeholder asks you a very detailed question about a car battery range project that's still in development. What strategies do you use in order to respond effectively? 
Select all that apply.Be certain that you understand the context of the question that the stakeholder is askingInvolve the whole audience when you respond to the stakeholderKeep your response short and to the point, then add detail if there are follow-up questionsGive yourself extra time by planning your thoughtful response when the stakeholder begins speaking Scenario 1, continued Your team’s analysis has revealed three key insights:Electric vehicle sales demand is expected to grow by more than 400% by 2025.The number of publicly available vehicle charging stations is a significant factor in consumer buying decisions. Currently, there are many locations with so few charging stations that electric car owners would run out of power when traveling between stations.Vehicle battery range is also a significant factor for consumers. In 2020, the average battery range was 210 miles. However, the vast majority of survey respondents report they will not buy an electric car until the battery range is at least 300 miles per charge.Fill in the blank: Based on these insights, you create a clear and direct _____, which will guide your data story.business casespotlightprimary messagespecific question Scenario 1, continued Now, it’s time to consider which tools to use to create data visualizations that will clearly communicate the results of your analysis. You and your team decide to make both spreadsheet charts and Tableau data visualizations. In addition, you agree to build a dashboard to share live, incoming data with your stakeholders. 
This will help them achieve the following goals:Organize multiple datasets about electric vehicle battery ranges into a central locationEnable tracking and analysis of electric vehicle dataSimplify data visualizations about the number of available charging stations using maps of the different geographiesAnother key benefit of dashboards is that they enable you to maintain control of your data narrative.TrueFalse Next, you explore how access to public car-charging stations is influencing electric vehicle purchases. As your analysis has revealed, there are many areas without enough places for people to plug in and charge their cars. This lack of charging stations has a negative impact on demand for electric cars and overall vehicle sales.You use Tableau to create the following draft of a visualization, which organizes the charging station data geographically:After reviewing your draft, you realize that it could be improved. What steps do you take to make your map more effective? Select all that apply.Select more varied huesAdd more space between each stateMake the intensity of the colors strongerChoose darker values Scenario 1, continued Now, you want to highlight what your team’s analysis discovered about the number of charging stations available compared to the number of cars purchased. Your data has confirmed that the lack of charging stations causes the effect of fewer car sales. To communicate this effectively, you will need to convey causation to the stakeholders. How do you explain causation?Causation involves how often data values fall into certain ranges. In the case of Gaea’s business, data about the number of charging stations will fall into ranges associated with car sales.Causation is the measure of the degree to which two variables move in relationship to each other. In the case of Gaea’s business, charging station numbers and car sales move in the same direction.Causation involves everything associated with an event. 
In the case of Gaea’s business, the lack of charging stations has a negative effect on the entire automotive marketplace.Causation is when an action directly leads to an outcome, such as a cause-effect relationship. In the case of Gaea’s business, the lack of charging stations directly leads to the outcome of fewer car sales. Scenario 2, continued You then create the following slide to demonstrate the challenges associated with battery range and charging stations:After reviewing your slide, you realize that the visual elements could be improved. You do this by first choosing one data visualization to share on this slide, then create another slide for the second data visualization. Fill in the blank: In addition, you make sure to use _____ font sizes and colors for all of your data visualization titles.consistentuniquecolorfuldifferent Scenario 2, continued Now that you have some idea of the questions the stakeholders will ask, you consider potential objections. You and a team member consider different objections that might arise. Your team member asks you how you will respond if someone from Gaea has an objection that you haven’t prepared for.You say that you will respond professionally using the information you currently have available in order to move quickly past the objection.TrueFalse Scenario 1, questions 1-9 You have been working as a junior data analyst at Bowling Green Business Intelligence for nearly a year. Your supervisor, Kate, tells you that she believes you are ready for more responsibility. She asks you to lead an upcoming client presentation. You will be responsible for creating the data story, identifying the right tools to use, building the slideshow, and delivering the presentation to stakeholders.Your client is Gaea, an automotive manufacturer that makes eco-friendly electric cars. For the past year, you have been working with the data team in Gaea’s Bowling Green, Kentucky, headquarters. 
For the presentation, you will engage the data team, as well as its regional sales representatives and distributors. Your presentation will inform their business strategy for the next three-to-five years.You begin by getting together with your team to discuss the data story you want to tell. You know the first step in data storytelling is to engage your audience. A big part of audience engagement is knowing how to eliminate less important details. What practice do you use to scan quickly through the data in order to identify the most important insights?BalancingRankingFilteringSpotlighting Scenario 1, continued Now that you have finished planning the data story with your team, it’s time to create data visualizations. First, you consider electric vehicle sales worldwide in 2015 compared to 2020. You use a spreadsheet to create the following bar graph to compare the two values:You add information on the x-axis to represent a scale of values for the total electric vehicle sales and on the y-axis to represent the time periods (2015 and 2020).FalseTrue Scenario 2, continued You then create the following slide to demonstrate the challenges associated with battery range and charging stations:After reviewing your slide, you realize that the visual elements could be improved. Which of the following options would help you make the visual elements on this slide more effective? Select all that apply.Use more colors in the mapProvide a detailed written explanation of both data visualizationsChoose one data visualization to share on this slide, then create another slide for the second data visualizationUse a consistent font size and color for data visualization titles Scenario 2, continued You complete your slideshow and share it with your team. Once it is approved by your supervisor, you prepare to give your presentation. You consider presentation best practices: maintaining good posture, being aware of nervous habits, and making eye contact. 
In addition, you think about how you will present your data visualizations. What strategies can help you explain the data visualizations effectively? Select all that apply.
  • Channel your excitement
  • Start with the broader ideas
  • Use the five-second rule
  • Speak quickly to save time and cover all important data points

Course 7 – Data Analysis with R Programming

Week 1 – Programming and data analytics

What are the benefits of using a programming language for data analysis? Select all that apply.
  • It is faster to clean data.
  • It is easy to share code.
  • It does not require data cleaning.
  • It does not require specific syntax.

What process does a data analyst use to instruct a computer to perform sets of actions?
  • Analytics
  • Programming
  • Filtering
  • Visualization

A team of data analysts is working on a complex analysis. The team needs to quickly process lots of data. They also need to easily reproduce and share every step of their analysis. What should they use for the analysis?
  • A dashboard
  • A database
  • Structured query language
  • The R programming language

What is a type of application that brings together all the tools a data analyst may want to use in a single place?
  • Spreadsheet
  • Integrated development environment
  • Database
  • Dashboard

Which of the following statements about RStudio’s integrated development environment are correct? Select all that apply.
  • RStudio only works on Windows.
  • RStudio panes are customizable.
  • RStudio includes a built-in console.
  • RStudio is closed-source.

A data analyst writes the code summary(penguins) in order to display a summary of the penguins dataset. Where in RStudio can the analyst execute the code? Select all that apply.
  • R console pane
  • Source editor pane
  • Files tab
  • Environment pane

A data analyst uses words and symbols to give instructions to a computer. What are the words and symbols known as?
  • Coded language
  • Function language
  • Programming languages
  • Syntax languages

Many data analysts prefer to use a programming language for which of the following reasons?
Select all that apply.
  • To save time
  • To clarify the steps of an analysis
  • To easily reproduce and share an analysis
  • To choose a topic for analysis

Fill in the blank: _____ code is freely available and may be modified and shared by the people who use it.
  • Open-ended
  • Open-source
  • Open-access
  • Open-syntax

Which of the following are benefits of using R for data analysis? Select all that apply.
  • Create high-quality data visualizations
  • Define a problem and ask the right questions
  • Process lots of data
  • Reproduce and share an analysis

Fill in the blank: A data analyst wants to quickly create visualizations and then share them with a teammate. They can use _____ for the analysis.
  • the R programming language
  • a dashboard
  • structured query language
  • a database

RStudio’s integrated development environment includes which of the following? Select all that apply.
  • A console for executing commands
  • An area to manage loaded data
  • A viewer for playing videos
  • An editor for writing code

Fill in the blank: When you execute code in the source editor, the code automatically also appears in the _____.
  • R console
  • plots tab
  • environment pane
  • files tab

A data analyst is working with spreadsheet data. The analyst imports the data from the spreadsheet into RStudio. Where in RStudio can the analyst find the imported data?
  • Source editor pane
  • Environment pane
  • R console pane
  • Plots tab

Shuffle Q/A

Fill in the blank: _____ are the words and symbols you use to write instructions for computers.
  • Code languages
  • Programming languages
  • Syntax languages
  • Variable languages

A data analyst wants to use a programming language that they can modify. What type of programming language should they use?
  • Console-based
  • Data-centric
  • Community-oriented
  • Open-source

A data analyst needs to quickly create a series of scatterplots to visualize a very large dataset.
What should they use for the analysis?
  • A dashboard
  • The R programming language
  • A slide presentation
  • Structured query language

What type of software is RStudio?
  • Integrated development environment
  • Programming language
  • Syntax
  • Pane

A data analyst wants to write R code where they can access it again after they close their current session in RStudio. Where should they write their code?
  • R console
  • Files tab
  • History tab
  • Source editor

What are the benefits of using a programming language for data analysis? Select all that apply.
  • They store steps of your analysis for future use.
  • They have no specific syntax.
  • They save time cleaning data.
  • It does not require data cleaning.

Which of the following statements about the R programming language are correct? Select all that apply.
  • It can create world-class visualizations
  • It makes analysts spend more time cleaning data and less time analyzing
  • It can process large amounts of data
  • It relies on spreadsheet interfaces to clean and manipulate data

A data analyst is searching for a tool that gives them the most power to customize the visualizations they use in their analysis. What tool should they use?
  • The R programming language
  • Tableau
  • Spreadsheets
  • SQL

Which of the following statements about RStudio’s integrated development environment are correct? Select all that apply.
  • RStudio is unable to produce visualizations.
  • RStudio is built specifically for working with R.
  • The layout of panes in RStudio is fixed.
  • RStudio helps with file management.

R users share custom solutions they have developed for data problems. Where can you find this information in RStudio?
  • Packages tab
  • History tab
  • Environment tab
  • R console

What tool gives data analysts the highest level of control over their data analysis?
  • Spreadsheet
  • SQL
  • Tableau
  • Programming language

Using a programming language can help you with which aspects of data analysis?
Select all that apply.
  • Visualize your data
  • Ask the right questions about your data
  • Transform your data
  • Clean your data

What is the term for programming code that is freely available and may be modified and shared by the people who use it?
  • Open-source
  • Open-ended
  • Data-centric
  • Open-data

For what reasons do many data analysts choose to use R? Select all that apply.
  • R can quickly process lots of data.
  • R is a data-centric programming language.
  • R can create high quality visualizations.
  • R is a closed source programming language.

What is a benefit of using the R programming language for data analysis? Select all that apply.
  • It is the most popular machine-learning language.
  • It is a general-purpose programming language.
  • It can create world-class visualizations.
  • It can work with large amounts of data.

RStudio’s integrated development environment lets you perform which of the following actions? Select all that apply.
  • Install R packages
  • Import data from spreadsheets
  • Create data visualizations
  • Stream online videos

Fill in the blank: In RStudio, the _____ is where you can find all the data you currently have loaded, organize it, and save it.
  • source editor pane
  • environment pane
  • R console pane
  • plots pane

Which of the following are benefits of open-source code? Select all that apply.
  • Anyone can pay a fee for access to the code.
  • Anyone can use the code for free.
  • Anyone can fix bugs in the code.
  • Anyone can create an add-on package for the code.

A data analyst is searching for an open-source tool that will allow them to work with very large amounts of data. What tool is the best option?
  • Spreadsheet
  • JSON
  • R
  • Tableau

In RStudio, where can you find and manage all the data you currently have loaded?
  • R console pane
  • Plots tab
  • Source editor pane
  • Environment pane

What are the benefits of using a programming language for data analysis?
Select all that apply.
  • Clarify the steps of the analysis
  • Easily reproduce and share the analysis
  • Automatically choose a topic for analysis
  • Efficiently save time

What attribute of the R programming language makes it an open-source programming language?
  • The code is designed to be data-centric.
  • The code is open to processing large amounts of data.
  • The code is distributed by a company named “Open-Source.”
  • The code can be modified and shared by anyone who uses it.

In which two parts of RStudio can you execute code? Select all that apply.
  • The environment pane
  • The source editor pane
  • The R console pane
  • The plots pane

How do data analysts refer to the words and symbols they use to write instructions for computers?
  • Programming languages
  • Syntax languages
  • Code languages
  • Variable languages

A data analyst wants to write R code in RStudio that will go away after they close their current session. Where should they write their code?
  • Environment tab
  • Source editor
  • Plots tab
  • R console

Week 2 – Programming using RStudio

A data analyst inputs the following code in RStudio:
    print(100 / 10)
What type of operators does the analyst use in the code?
  • Assignment
  • Conditional
  • Logical
  • Arithmetic

Which of the following is a best practice when naming variables in R?
  • Variable names should be verbs.
  • Variable names should start with special characters.
  • Use lowercase for variable names.
  • Use a space character to separate words in variable names.

A data analyst is assigning a variable to a value in their company’s sales dataset for 2020. Which variable name uses the correct syntax?
  • -sales-2020
  • 2020_sales
  • sales_2020
  • _2020sales

You want to create a vector with the values 12, 23, 51, in that exact order. After specifying the variable, what R code chunk allows you to create the vector?
  • c(12, 23, 51)
  • v(12, 23, 51)
  • c(51, 23, 12)
  • v(51, 23, 12)

An analyst runs code to convert string data into a date/time data type that results in the following: "2020-07-10". Which of the following are examples of code that would lead to this return?
Select all that apply.
  • mdy("July 10th, 2020")
  • ymd(20200710)
  • myd(2020, July 10)
  • dmy("7-10-2020")

A data analyst inputs the following code in RStudio:
    change_1 <- 70
Which of the following types of operators does the analyst use in the code?
  • Assignment
  • Logical
  • Relational
  • Arithmetic

A data analyst is deciding on naming conventions for an analysis that they are beginning in R. Which of the following rules are widely accepted stylistic conventions that the analyst should use when naming variables? Select all that apply.
  • Use single letters, such as “x”, to name all variables
  • Use an underscore to separate words within a variable name
  • Begin all variable names with an underscore
  • Use all lowercase letters in variable names

In R, what includes reusable functions and documentation about how to use the functions?
  • Pipes
  • Comments
  • Packages
  • Vectors

Packages installed in RStudio are called from CRAN. CRAN is an online archive with R packages and other R-related resources.
  • True
  • False

A data analyst is reviewing some code and finds the following code chunk:
    mtcars %>%
      filter(carb > 1) %>%
      group_by(cyl) %>%
What is this code chunk an example of?
  • Pipe
  • Nested function
  • Vector
  • Data frame

Shuffle Q/A

A data analyst finds the code mdy(10211020) in an R script. What is the year of the date that is created?
  • 1021
  • 1020
  • 1102
  • 2120

Which of the following is a best practice when naming R script files?
  • R script file names should end in “.R”
  • R script file names should end in “.S”
  • R script file names should end in “.rscript”
  • R script file names should end in “.r-script”

How are base packages different from recommended packages in the R package ecosystem?
  • Recommended packages are made by the community and base packages are not.
  • Base packages take longer to load than recommended packages.
  • Base packages are installed and loaded by default and recommended packages are not.
  • Recommended packages are more professionally designed than base packages.
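The difference between base and recommended packages can be inspected in an R session. The following is a minimal sketch that uses only base R and assumes a standard R installation; the variable names (`pkgs`, `base_pkgs`, `recommended_pkgs`, `attached`) are illustrative and not part of the quiz. Note that the quiz wording is a simplification: every base package is installed with R, but only some of them are attached when a session starts.

```r
# Matrix of installed packages; the "Priority" column is "base",
# "recommended", or NA for everything else.
pkgs <- installed.packages()

# Packages that ship as part of R itself.
base_pkgs <- rownames(pkgs)[pkgs[, "Priority"] %in% "base"]

# Packages flagged as recommended (may be empty on a minimal install).
recommended_pkgs <- rownames(pkgs)[pkgs[, "Priority"] %in% "recommended"]

# Packages currently attached to the search path of this session.
attached <- sub("^package:", "", grep("^package:", search(), value = TRUE))

# Base packages usable right now without calling library().
print(intersect(base_pkgs, attached))

# Recommended packages that are attached; typically none in a fresh session,
# because they need an explicit library() call first.
print(intersect(recommended_pkgs, attached))
```

Running this in a fresh session and then again after a call such as `library(MASS)` (if that recommended package is installed) shows the second list change while the first stays the same.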
Why would a data analyst want to use the CRAN network when working with RStudio?
  • To add new operators to R
  • To install R packages
  • To add pipes to R
  • To install drivers to RStudio

A data analyst wants to take a data frame named people and filter the data where age is 10, arranged by height, and grouped by gender. Which code snippet would perform those operations in the specified order?

Which of the following are examples of variable names that can be used in R? Select all that apply.
  • autos_5
  • utility2
  • 3_sales
  • red1

You want to create a vector with the values 43, 56, 12 in that exact order. After specifying the variable, what R code chunk lets you create the vector?
  • c(43, 56, 12)
  • v(12, 56, 43)
  • v(43, 56, 12)
  • c(12, 56, 43)

An analyst comes across dates listed as strings in a dataset. For example, December 10th, 2020. To convert the strings to a date/time data type, which function should the analyst use?
  • lubridate()
  • datetime()
  • now()
  • mdy()

A data analyst inputs the following code in RStudio:
    sales_1 <- (3500.00 * 12)
Which of the following types of operators does the analyst use in the code? Select all that apply.
  • Relational
  • Logical
  • Arithmetic
  • Assignment

Which of the following files in R have names that follow widely accepted naming convention rules? Select all that apply.
  • patient_details_1.R
  • title*123.R
  • p1+infoonpatients.R
  • patient_data.R

Which of the following are included in R packages? Select all that apply.
  • Naming conventions for R variable names
  • Reusable R functions
  • Tests for checking your code
  • Sample datasets

What is the name of the popular package archive dedicated to supporting R users with authentic, validated code?
  • The CRAN archive
  • The RStudio website
  • The tidyverse
  • Python

A data analyst writes the following code in a script and gets an error.
What is wrong with their code?
    penguins %>%
      filter(flipper_length_mm == 200) %>%
      group_by(species) %>%
      summarize(mean = mean(body_mass_g)) %>%
  • They are using too many functions.
  • The last line should not have a pipe operator.
  • The first line should have a pipe operator before penguins.
  • They are using the wrong characters for the pipe operator.

Fill in the blank: When creating a variable for use in R, your variable name should begin with _____.
  • an operator
  • a letter
  • an underscore
  • a number

You want to create a vector with the values 21, 12, 39, in that exact order. After specifying the variable, what R code chunk lets you create the vector?
  • c(39, 12, 21)
  • v(39, 12, 21)
  • v(21, 12, 39)
  • c(21, 12, 39)

If you use the mdy() function in R to convert the string "April 10, 2019", what will return when you run your code?
  • "4.10.19"
  • "4/10/2019"
  • "2019-10-4"
  • "2019-4-10"

A data analyst wants to combine values using mathematical operations. What type of operator would they use to do this?
  • Arithmetic
  • Conditional
  • Logical
  • Assignment

Which of the following files in R have names that follow widely accepted naming convention rules? Select all that apply.
  • p1+infoonpatients.R
  • patient_data.R
  • patient_details_1.R
  • title*123.R

A data analyst wants to create functions, documentation, sample data sets, and code tests that they can share and reuse in other projects. What should they create to help them accomplish this?
  • A data frame
  • A tidyverse
  • A data type
  • A package

A data analyst needs a system of packages that use a common design philosophy for data manipulation, exploration, and visualization. What set of packages fulfills their need?
  • Base
  • CRAN
  • tidyverse
  • Recommended

Which of the following are examples of variable names that can be used in R?
Select all that apply.
  • alpha_21
  • alpha21

What function is used to create vectors in the R programming language?
  • v()
  • c()
  • vector()
  • combine()

What type of packages are automatically installed and loaded to use in RStudio when you start your first programming session?
  • Recommended packages
  • Base packages
  • Community packages
  • CRAN packages

Why would you want to use pipes instead of nested functions in R? Select all that apply.
  • Pipes make it easier to add or remove functions.
  • Pipes make it easier to read long sequences of functions.
  • Nested functions are no longer supported by R.
  • Pipes allow you to combine more functions in a single sequence.

Which of the following are examples of variable names that can be used in R?
  • value(2)
  • value-2
  • value_2
  • value%2

A data analyst has a dataset that contains date strings like "January 10th, 2022." What lubridate function can they use to convert these strings to dates?
  • myd()
  • mdy()
  • dmy()
  • ymd()

What is the relationship between RStudio and CRAN?
  • RStudio and CRAN are both environments where data analysts can program using R code.
  • CRAN creates visualizations based on an analyst’s programming in RStudio.
  • CRAN contains all of the data that RStudio users need for analysis.
  • RStudio installs packages from CRAN that are not in Base R.

A data analyst previously created a series of nested functions that carry out multiple operations on some data in R. The analyst wants to complete the same operations but make the code easier to understand for their stakeholders. Which of the following can the analyst use to accomplish this?
  • Pipe
  • Comment
  • Argument
  • Vector

A data analyst wants to assign the value 50 to the variable daily_dosage. Which of the following types of operators will they need to use in the code?
  • Relational
  • Arithmetic
  • Assignment

A data analyst needs to find a package that offers a consistent set of functions that help them complete common data manipulation tasks like selecting and filtering.
What tidyverse package provides this functionality?
  • tidyr
  • readr
  • ggplot2
  • dplyr

When programming in R, what is a pipe used as an alternative for?
  • Nested function
  • Variable
  • Installed package
  • Vector

Week 3 – Working with data in R

A data scientist is trying to print a data frame, but the console output produces too many rows and columns to be readable. What could they use instead of a data frame to make printing more readable?
  • A list
  • A structure
  • A tibble
  • A vector

A data analyst is working with a large data frame. It contains so many columns that they don’t all fit on the screen at once. The analyst wants a quick list of all of the column names to get a better idea of what is in their data. What function should they use?
  • colnames()
  • str()
  • mutate()
  • head()

You are working with the penguins dataset. You want to use the summarize() and min() functions to find the minimum value for the variable flipper_length_mm. At this point, the following code has already been written into your script:
    penguins %>%
    drop_na() %>%
    group_by(species, sex) %>%
Add the code chunk that lets you find the minimum value for the variable flipper_length_mm. (Note: do not type the above code into the code block editor, as it has already been inputted. Simply add a single line of code based on the prompt.) What species and sex have the lowest minimum flipper length in mm?
  • Chinstrap males
  • Adelie females
  • Gentoo females
  • Gentoo males

A data analyst is working with a dataset in R that has more than 50,000 observations. Why might they choose to use a tibble instead of the standard data frame? Select all that apply.
  • Tibbles can create row names
  • Tibbles automatically only preview the first 10 rows of data
  • Tibbles can automatically change the names of variables
  • Tibbles automatically only preview as many columns as fit on screen

A data analyst is exploring their data to get more familiar with it. They want a preview of just the first six rows to get a better idea of how the data frame is laid out.
What function should they use?
  • print()
  • preview()
  • head()
  • colnames()

You are working with the ToothGrowth dataset. You want to use the head() function to get a preview of the dataset. Write the code chunk that will give you this preview. What are the names of the columns in the ToothGrowth dataset?
  • VC, supp, dose
  • len, supp, dose
  • len, supp, VC
  • len, VC, dose

A data analyst is working with a data frame named sales. They write the following code:
    sales %>%
The data frame contains a column named q1_sales. What code chunk does the analyst add to change the name of the column from q1_sales to quarter1_sales?
  • rename(quarter1_sales = q1_sales)
  • rename(q1_sales <- "quarter1_sales")
  • rename(quarter1_sales <- "q1_sales")
  • rename(q1_sales = quarter1_sales)

A data analyst is working with the penguins data. They write the following code:
    penguins %>%
The variable species includes three penguin species: Adelie, Chinstrap, and Gentoo. What code chunk does the analyst add to create a data frame that only includes the Gentoo species?
  • filter(species == "Gentoo")
  • filter(species <- "Gentoo")
  • filter(Gentoo == species)
  • filter(species == "Adelie")

You are working with the penguins dataset. You want to use the summarize() and max() functions to find the maximum value for the variable flipper_length_mm. You write the following code:
    penguins %>%
    drop_na() %>%
    group_by(species) %>%
Add the code chunk that lets you find the maximum value for the variable flipper_length_mm. What is the maximum flipper length in mm for the Gentoo species?
  • 200
  • 212
  • 210
  • 231

A data analyst is working with a data frame called salary_data. They want to create a new column named total_wages that adds together data in the standard_wages and overtime_wages columns.
What code chunk lets the analyst create the total_wages column?
  • mutate(salary_data, standard_wages = total_wages + overtime_wages)
  • mutate(salary_data, total_wages = standard_wages + overtime_wages)
  • mutate(salary_data, total_wages = standard_wages * overtime_wages)
  • mutate(total_wages = standard_wages + overtime_wages)

A data analyst is working with a data frame named stores. It has separate columns for city (city) and state (state). The analyst wants to combine the two columns into a single column named location, with the city and state separated by a comma. What code chunk lets the analyst create the location column?
  • unite(stores, "location", city, state, sep=",")
  • unite(stores, "location", city, sep=",")
  • unite(stores, city, state, sep=",")
  • unite(stores, "location", city, state)

A data analyst writes the following code chunk to return a statistical summary of their dataset:
    quartet %>% group_by(set) %>% summarize(mean(x), sd(x), mean(y), sd(y), cor(x, y))
Which function will return the average value of the y column?
  • mean(y)
  • mean(x)
  • cor(x, y)
  • sd(x)

A data analyst uses the bias() function to compare the actual outcome with the predicted outcome to determine if the model is biased. They get a score of 0.8. What does this mean?
  • Bias cannot be determined
  • The model is biased
  • Bias can be determined
  • The model is not biased

Shuffle Q/A

What is an advantage of using data frames instead of tibbles?
  • Data frames allow you to create row names
  • Data frames make printing easier
  • Data frames allow you to use column names
  • Data frames never change variable names

A data analyst is examining a new dataset for the first time. They load the dataset into a data frame to learn more about it. What function(s) will allow them to review the names of all of the columns in the data frame? Select all that apply.
  • colnames()
  • head()
  • str()
  • library()

You are working with the ToothGrowth dataset. You want to use the skim_without_charts() function to get a comprehensive view of the dataset.
Write the code chunk that will give you this view. What is the average value of the len column?
  • 18.8
  • 13.1
  • 4.2
  • 7.65

A data analyst is working with a data frame named cars. The analyst notices that all the column names in the data frame are capitalized. What code chunk lets the analyst change all the column names to lowercase?
  • rename_with(tolower, cars)
  • rename_with(cars, toupper)
  • rename_with(toupper, cars)
  • rename_with(cars, tolower)

A data analyst is working with the penguins dataset and wants to sort the penguins by body_mass_g from least to greatest. When they run the following code, the penguin body mass data is not displayed in the correct order.
    penguins %>% arrange(body_mass_g)
    head(penguins)
What can the data analyst do to fix their code?
  • Save the results of arrange() to a variable that gets passed to head()
  • Add a minus sign in front of body_mass_g to reverse the order
  • Correct the capitalization of arrange() to Arrange()
  • Use the print() function instead of the head() function

You are working with the penguins dataset. You want to use the summarize() and mean() functions to find the mean value for the variable body_mass_g. You write the following code:
    penguins %>%
    drop_na() %>%
    group_by(species) %>%
Add the code chunk that lets you find the mean value for the variable body_mass_g. What is the mean body mass in g for the Adelie species?
  • 3733.088
  • 5092.437
  • 3706.164
  • 4207.433

A data analyst is working with a data frame called zoo_records. They want to create a new column named is_large_animal that signifies if an animal has a weight of more than 199 kilograms. What code chunk lets the analyst create the is_large_animal column?
  • zoo_records %>% mutate(is_large_animal = weight > 199)
  • zoo_records %>% mutate(weight > 199 = is_large_animal)
  • zoo_records %>% mutate(is_large_animal == weight > 199)
  • zoo_records %>% mutate(weight > 199 <- is_large_animal)

A data analyst is working with a data frame named users. It has separate columns for first name (first_name) and last name (last_name).
The analyst wants to combine the two columns into a single column called full_name, with the first name and last name separated by a space. What code chunk lets the analyst create the full_name column?
  • unite(users, first_name, last_name, "full_name", sep = " ")
  • unite(users, "full_name", first_name, last_name, sep = " ")
  • merge(users, "full_name", first_name, last_name, sep = " ")
  • unite(users, "full_name", first_name, last_name, sep = ", ")

A data analyst is using statistical measures to get a better understanding of their data. What function can they use to determine how strongly two of the variables are related?
  • mean()
  • bias()
  • sd()
  • cor()

A data analyst wants to find out how much the predicted outcome and the actual outcome of their data model differ. What function can they use to quickly measure this?
  • mean()
  • bias()
  • cor()
  • sd()

A data analyst creates a data frame with data that has more than 50,000 observations in it. When they print their data frame, it slows down their console. To avoid this, they decide to switch to a tibble. Why would a tibble be more useful in this situation?
  • Tibbles won’t overload the console because they automatically only print the first 10 rows of data and as many variables as will fit on the screen
  • Tibbles will automatically change the names of variables to make them shorter and easier to read
  • Tibbles only include a limited number of data items
  • Tibbles will automatically create row names to make the data easier to read

A data analyst wants to learn more about a specific data frame. Which function will allow them to review the data types of each column in the data frame?
  • package()
  • colnames()
  • library()
  • str()

You have a data frame named employees with a column named Last_NAME. What will the name of the employees column be in the results of the function rename_with(employees, tolower)?
  • last_name
  • last_nAME
  • lAST_nAME
  • Last_NAME

You are working with the penguins dataset.
You want to use the summarize() and min() functions to find the minimum value for the variable bill_depth_mm. You write the following code:
    penguins %>%
    drop_na() %>%
    group_by(species) %>%
Add the code chunk that lets you find the minimum value for the variable bill_depth_mm. What is the minimum bill depth in mm for the Chinstrap species?
  • 16.4
  • 13.1
  • 15.5
  • 12.4

A data analyst is working with a data frame called salary_data. They want to create a new column named hourly_salary that includes data from the wages column divided by 40. What code chunk lets the analyst create the hourly_salary column?
  • mutate(salary_data, hourly_salary = wages / 40)
  • mutate(salary_data, hourly_salary = wages * 40)
  • mutate(hourly_salary = wages / 40)
  • mutate(hourly_salary, salary_data = wages / 40)

In R, which statistical measure demonstrates how strong the relationship is between two variables?
  • Correlation
  • Maximum
  • Standard deviation
  • Average

A data analyst creates two different predictive models for the same dataset. They use the bias() function on both models. The first model has a bias of -40. The second model has a bias of 1. Which model is less biased?
  • The second model
  • It can’t be determined from this information
  • The first model

What scenarios would prevent you from being able to use a tibble?
  • You need to create column names
  • You need to store numerical data
  • You need to create row names
  • You need to change the data types of inputs

A data analyst is working with a data frame named salary_data. They want to create a new column named wages that includes data from the rate column multiplied by 40. What code chunk lets the analyst create the wages column?
  • mutate(salary_data, wages = rate * 40)
  • mutate(salary_data, wages = rate + 40)
  • mutate(wages = rate * 40)
  • mutate(salary_data, rate = wages * 40)

A data analyst wants to check the average difference between the actual and predicted values of a model.
What single function can they use to calculate this statistic?
  • bias()
  • cor()
  • sd()
  • mean()

A data analyst is considering using tibbles instead of basic data frames. What are some of the limitations of tibbles? Select all that apply.
  • Tibbles can overload a console
  • Tibbles can never change the input type of the data
  • Tibbles won’t automatically change the names of variables

A data analyst wants a high level summary of the structure of their data frame, including the column names, the number of rows and variables, and type of data within a given column. What function should they use?
  • colnames()
  • head()
  • rename_with()
  • str()

You are working with the ToothGrowth dataset. You want to use the glimpse() function to get a quick summary of the dataset. Write the code chunk that will give you this summary. How many variables does the ToothGrowth dataset contain?
  • 5
  • 4
  • 2
  • 3

A data analyst is working with the penguins dataset in R. What code chunk will allow them to sort the penguins data by the variable bill_length_mm?
  • arrange(penguins, bill_length_mm)
  • arrange(bill_length_mm, penguins)
  • arrange(=bill_length_mm)

A data analyst is working with a data frame called sales. In the data frame, a column named location represents data in the format “city, state”. The analyst wants to split the city into an individual city column and state into a new country column. What code chunk lets the analyst split the location column?
  • separate(sales, location, into=c("country", "city"), sep=", ")
  • separate(sales, location, into=c("city", "country"), sep=", ")
  • untie(sales, location, into=c("city", "country"), sep=", ")
  • separate(sales, location, into=c("country", "city"), sep=" ")

A data analyst is working with the penguins data. The variable species includes three penguin species: Adelie, Chinstrap, and Gentoo. The analyst wants to create a data frame that only includes the Adelie species.
The analyst receives an error message when they run the following code:
    penguins %>%
    filter(species <- "Adelie")
How can the analyst change the second line of code to correct the error?
  • filter(Adelie == species)
  • filter("Adelie")
  • filter("Adelie" <- species)
  • filter(species == "Adelie")

You are working with the penguins dataset and want to understand the year of data collection for all combinations of species, island, and sex. You write the following code:
    penguins %>%
    drop_na() %>%
    group_by(species) %>%
    summarize(min = min(year), max = max(year))
When you run the code in the code box, how many different groups are returned by this code chunk?
  • 3
  • 10
  • 2
  • 6

You are working with the ToothGrowth dataset. You want to use the glimpse() function to get a quick summary of the dataset. Write the code chunk that will give you this summary. How many different data types are used for the column data types?
  • 2
  • 3
  • 60
  • 1

A data analyst is working with a data frame named customers. It has separate columns for area code (area_code) and phone number (phone_num). The analyst wants to combine the two columns into a single column called phone_number, with the area code and phone number separated by a hyphen. What code chunk lets the analyst create the phone_number column?
  • unite(customers, "phone_number", area_code, sep="-")
  • unite(customers, "phone_number", area_code, phone_num, sep="-")
  • unite(customers, "phone_number", area_code, phone_num)
  • unite(customers, area_code, phone_num, sep="-")

You are compiling an analysis of the average monthly costs for your company. What summary statistic function should you use to calculate the average?
  • mean()
  • max()
  • cor()
  • min()

A data analyst is studying weather data.
They write the following code chunk:
    bias(actual_temp, predicted_temp)
What will this code chunk calculate?
  • The average difference between the actual and predicted values
  • The maximum difference between the actual and predicted values
  • The total average of the values
  • The minimum difference between the actual and predicted values

Week 4 – More about visualizations, aesthetics, and annotations

A data analyst creates a scatterplot with many data points. The analyst wants to make some points on the plot more transparent than others. What aesthetic should the analyst use?
  • Alpha
  • Fill
  • Color
  • Shape

You are working with the diamonds dataset. You create a bar chart with the following code:
    ggplot(data = diamonds) +
    geom_bar(mapping = aes(x = color, fill = cut)) +
You want to use the facet_wrap() function to display subsets of your data. Add the code chunk that lets you facet your plot based on the variable cut. How many subplots does your visualization show?
  • 6
  • 4
  • 3
  • 5

Which of the following are benefits of using ggplot2? Select all that apply.
  • Customize the look and feel of your plot
  • Easily add layers to your plot
  • Combine data manipulation and visualization
  • Automatically clean data before creating a plot

A data analyst creates a bar chart with the diamonds dataset. They begin with the following line of code:
    ggplot(data = diamonds)
What symbol should the analyst put at the end of the line of code to add a layer to the plot?
  • pipe operator (%>%)
  • plus sign (+)
  • equal sign (=)
  • ampersand symbol (&)

A data analyst creates a plot using the following code chunk:
    ggplot(data = penguins) +
    geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g))
Which of the following represents a function in the code chunk? Select all that apply.
  • The aes function
  • The geom_point function
  • The data function
  • The ggplot function

Fill in the blank: In ggplot2, the term mapping refers to the connection between variables and _____.
  • data frames
  • geoms
  • facets
  • aesthetics

A data analyst creates a scatterplot with a lot of data points.
The analyst wants to make some points on the plot more transparent than others. What aesthetic should the analyst use?
  • Color
  • Shape
  • Alpha
  • Fill

You are working with the penguins dataset. You create a scatterplot with the following code:
    ggplot(data = penguins) +
    geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g))
You want to highlight the different penguin species on your plot. Add a code chunk to the second line of code to map the aesthetic shape to the variable species. NOTE: the three dots (...) indicate where to add the code chunk. Which penguin species does your visualization display?
  • Adelie, Chinstrap, Gentoo
  • Emperor, Chinstrap, Gentoo
  • Adelie, Chinstrap, Emperor
  • Adelie, Gentoo, Macaroni

A data analyst creates a plot with the following code chunk:
    ggplot(data = penguins) +
    geom_jitter(mapping = aes(x = flipper_length_mm, y = body_mass_g))
What does the geom_jitter() function do to the points in the plot?
  • Adds a small amount of random shapes at each point in the plot
  • Decrease the size of each point in the plot
  • Adds a small amount of random noise to each point in the plot
  • Adds random colors to each point in the plot

You are working with the diamonds dataset. You create a bar chart with the following code:
    ggplot(data = diamonds) +
    geom_bar(mapping = aes(x = color, fill = cut)) +
You want to use the facet_wrap() function to display subsets of your data. Add the code chunk that lets you facet your plot based on the variable clarity. How many subplots does your visualization show?
  • 9
  • 6
  • 8
  • 7

Fill in the blank: You can use the _____ function to put a text label on your plot to call out specific data points.
  • annotate()
  • ggplot()
  • facet_grid()
  • geom_smooth()

You are working with the penguins dataset.
You create a scatterplot with the following lines of code:
    ggplot(data = penguins) +
    geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g)) +
What code chunk do you add to the third line to save your plot as a png file with “penguins” as the file name?
  • ggsave("penguins")
  • ggsave(penguins.png)
  • ggsave("png.penguins")
  • ggsave("penguins.png")

Shuffle Q/A

In ggplot2, what symbol do you use to add layers to your plot?
  • The pipe operator (%>%)
  • The plus sign (+)
  • The ampersand symbol (&)
  • The equals sign (=)

A data analyst creates a plot using the following code chunk:
    ggplot(data = buildings) +
    geom_bar(mapping = aes(x = construction_year, color = height))
Which of the following represents an aesthetic attribute in the code chunk?
  • ggplot
  • construction_year
  • buildings
  • x

Which code snippet will make all of the bars in the plot have different colors based on their heights?
  • ggplot(data = buildings) + geom_bar(mapping = aes(x = construction_year), color=height)
  • ggplot(data = buildings) + geom_bar(mapping = aes(x = construction_year)) + color("height")
  • ggplot(data = buildings) + geom_bar(mapping = aes(x = construction_year, color=height))
  • ggplot(data = buildings) + geom_bar(mapping = aes(x = construction_year)) + color(height)

What is the purpose of the facet_wrap() function?
  • Modify the visual characteristic of a data point
  • Modify ggplot visuals to be three-dimensional
  • Create text inside a plot area
  • Create subplots in a grid of two variables

A data analyst uses the annotate() function to create a text label for a plot. Which attributes of the text can the analyst change by adding code to the argument of the annotate() function? Select all that apply.
  • Change the font style of the text.
  • Change the color of the text.
  • Change the size of the text.
  • Change the text into a title for the plot.
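
As a rough illustration of the ggplot2 ideas quizzed above (layers added with +, an aesthetic mapped inside aes() versus a fixed color set outside it, facet_wrap(), annotate(), and ggsave()), here is a minimal sketch. It assumes the ggplot2 package is installed and uses the diamonds data that ships with it; the label text, its coordinates, and the output file name are illustrative choices, not part of the course material.

```r
library(ggplot2)

# Build the plot layer by layer; each layer is joined with a plus sign.
p <- ggplot(data = diamonds) +
  # Inside aes(): fill varies with the cut variable.
  # Outside aes(): every bar gets the same black outline.
  geom_bar(mapping = aes(x = color, fill = cut), color = "black") +
  # One subplot per level of cut.
  facet_wrap(~cut) +
  # annotate() draws text inside the plot area; color, size, and
  # fontface are ordinary arguments (values here are illustrative).
  annotate("text", x = "G", y = 4500, label = "Color G",
           color = "purple", size = 3, fontface = "bold")

# Hypothetical output path; ggsave() picks the format from the extension.
out_file <- file.path(tempdir(), "diamonds_bars.png")
ggsave(out_file, plot = p, width = 8, height = 5)
```

Passing plot = p makes the export explicit; without it, ggsave() falls back to the last plot displayed.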
Which statement about the ggsave() function is correct?
  • ggsave() exports the last plot displayed by default.
  • ggsave() is run from the Plots Tab in RStudio.
  • ggsave() is the only way to export a plot.
  • ggsave() is unable to save .png files.

Which of the following statements about ggplot is true?
  • ggplot allows analysts to create plots using a single function.
  • ggplot is the default plotting package in base R.
  • ggplot allows analysts to create different types of plots.
  • ggplot is designed to make cleaning data easy.

A data analyst creates a plot using the following code chunk:
    ggplot(data = buildings) +
    geom_bar(mapping = aes(x = construction_year, color = height))
Which of the following represents a variable in the code chunk?
  • construction_year
  • mapping
  • data
  • ggplot

Which code snippet will make all of the bars in the plot purple?
  • ggplot(data = buildings) + geom_bar(mapping = aes(x = construction_year, color="purple"))
  • ggplot(data = buildings) + geom_bar(mapping = aes(x = construction_year)) + color("purple")
  • ggplot(data = buildings) + geom_bar(mapping = aes(x = construction_year, color=height))
  • ggplot(data = buildings) + geom_bar(mapping = aes(x = construction_year), color="purple")

A data analyst is working with the following plot and gets an error caused by a bug. What is the cause of the bug?
    ggplot(data = penguins) %>%
    geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g))
  • The code uses a pipe instead of a plus sign.
  • A missing closing parenthesis needs to be added.
  • The pipe should be at the beginning of the second line.
  • A function name needs to be capitalized.

You are working with the penguins dataset. You create a scatterplot with the following code chunk:
    ggplot(data = penguins) +
    geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g))
You want to highlight the different penguin species in your plot. Add a code chunk to the second line of code to map the aesthetic size to the variable bill_depth_mm. NOTE: the three dots (...) indicate where to add the code chunk.
You may need to scroll in order to find the dots. Which approximate range of bill depths does your visualization display?
  • 2 – 9
  • 31 – 40
  • 20 – 31
  • 14 – 20

A data analyst has a scatter plot with crowded points that make it hard to identify a trend. What geometry function can they add to their plot to clearly indicate the trend of the data?
  • geom_alpha()
  • geom_bar()
  • geom_jitter()
  • geom_smooth()

A data analyst wants to add a large piece of text above the grid area that clearly defines the purpose of a plot. Which ggplot function can they use to achieve this?
  • subtitle()
  • title()
  • labs()
  • annotate()

By default, what plot does the ggsave() function export?
  • The plot defined in the plots.config file
  • The last displayed plot
  • The plot defined in the Plots Tab of RStudio
  • The first plot displayed

Which of the following tasks can you complete with ggplot2 features? Select all that apply.
  • Customize the visual features of a plot
  • Automatically clean data before creating a plot
  • Add labels and annotations to a plot
  • Create many different types of plots

A data analyst is working with the following plot and gets an error caused by a bug. What is the cause of the bug?
    ggplot(data = penguins)
    + geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g))
  • The plus should be at the end of the first line.
  • A missing closing parenthesis needs to be added.
  • The code uses a plus sign instead of a pipe.
  • A function name needs to be capitalized.

You are working with the penguins dataset. You create a scatterplot with the following code chunk:
    ggplot(data = penguins) +
    geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g))
You want to highlight the different penguin species in your plot. Add a code chunk to the second line of code to map the aesthetic shape to the variable species. NOTE: the three dots (...) indicate where to add the code chunk.
You may need to scroll in order to find the dots. Which species tends to have the longest flipper length and highest body mass?
  • Gentoo
  • Macaroni
  • Adelie
  • Chinstrap

A data analyst creates a scatterplot where the points are very crowded, which makes it hard to notice when points are stacked. What change can they make to their scatter plot to make it easier to notice the stacked data points?
  • Change geom_point() to geom_jitter()
  • Change ggplot() to ggplot2()
  • Change the color of the points
  • Change the shape of the points

Which code snippet will make all of the bars in the plot have different colors and shapes based on their heights?
  • ggplot(data = buildings) + geom_bar(mapping = aes(x = construction_year, color=[height, height]))
  • ggplot(data = buildings) + geom_bar(mapping = aes(x = construction_year, color=height, shape=height))
  • ggplot(data = buildings) + geom_bar(mapping = aes(x = construction_year, color=height), aes(shape=height))
  • ggplot(data = buildings) + geom_bar(mapping = aes(x = construction_year)) + color(height) + shape(height)

You are working with the penguins dataset. You create a scatterplot with the following code:
    ggplot(data = penguins) +
    geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g))
You want to highlight the different years of data collection on your plot. Add a code chunk to the second line of code to map the aesthetic size to the variable year. NOTE: the three dots (...) indicate where to add the code chunk.
You may need to scroll in order to find the dots. What years does your visualization display?
  • 2006-2010
  • 2005-2009
  • 2007-2009
  • 2007-2011

Fill in the blank: The _____ creates a scatterplot and then adds a small amount of random noise to each point in the plot to make the points easier to find.
  • geom_jitter() function
  • geom_point() function
  • geom_bar() function
  • geom_smooth() function

A data analyst creates a plot using the following code chunk:
    ggplot(data = buildings) +
    geom_bar(mapping = aes(x = construction_year, color = height))
Which of the following represents a function in the code chunk?
  • The height function
  • The x function
  • The ggplot function
  • The mapping function

A data analyst is working with the following plot and gets an error caused by a bug. What is the cause of the bug?
    ggplot(data = penguins) +
    geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g)
  • A missing closing parenthesis needs to be added.
  • The plus sign should be at the beginning of the second line.
  • The code uses a plus sign instead of a pipe.
  • A function name needs to be capitalized.

Which of the following statements best describes a facet in ggplot?
  • Facets are the ggplot terminology for a chart axis.
  • Facets are subplots that display data for each value of a variable.
  • Facets are the visual characteristics of geometry objects.
  • Facets are the text used in and around plots.

Which of the following is a functionality of ggplot2?
  • Combine data manipulation and visualizations using pipes.
  • Filter and sort data in complex ways.
  • Define complex visualization using a single function.
  • Create plots using artificial intelligence.

Which ggplot function is used to define the mappings of variables to visual representations of data?
  • annotate()
  • mapping()
  • aes()
  • ggplot()

You are working with the penguins dataset. You create a scatterplot with the following code chunk:
    ggplot(data = penguins) +
    geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g))
You want to highlight the different years of data collection on your plot.
Add a code chunk to the second line of code to map the aesthetic alpha to the variable island. NOTE: the three dots (...) indicate where to add the code chunk. You may need to scroll in order to find the dots. What islands does your visualization display?
  • Biscoe, Dream, Torgersen
  • Cebu, Borneo, Torgersen
  • Cebu, Java, Hispaniola
  • Biscoe, Java, Buton

What function creates a scatterplot and then adds a small amount of random noise to each point in the plot to make the points easier to find?
  • The geom_smooth() function
  • The geom_jitter() function
  • The geom_point() function
  • The geom_bar() function

A data analyst wants to add text elements inside the grid area of their plot. Which ggplot function allows them to do this?
  • annotate()
  • labs()
  • facet()
  • text()

You are working with the penguins dataset. You create a scatterplot with the following lines of code:
    ggplot(data = penguins) +
    geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g)) +
What code chunk do you add to the third line to save your plot as a pdf file with “penguins” as the file name?
  • ggsave(penguins.pdf)
  • ggsave("pdf.penguins")
  • ggsave(=penguins)
  • ggsave("penguins.pdf")

Week 5 – Documentation and reports

A data analyst wants to create a shareable report of their analysis with documentation of their process and notes explaining their code to stakeholders. What tool can they use to generate this?
  • R Markdown
  • Filters
  • Code chunks
  • Dashboards

A data analyst wants to add a bulleted list to their R Markdown document. What symbol can they type to create this formatting?
  • Delimiters
  • Hashtags
  • Brackets
  • Asterisks

A data analyst wants to create documentation for their cleaning process so other analysts on their team can recreate this process. What tool can help them create this shareable report?
  • Code chunks
  • Inline code
  • Dashboards
  • R Markdown

A data analyst wants to export their R Markdown notebook as a text document. What are the text document formats they can use to share their R Markdown notebook?
Select all that apply.
  • Notepad
  • Word
  • PDF
  • HTML

A data analyst writes two hashtags next to their header. What will this do to the header font in the .rmd file?
  • Make it bigger
  • Make it smaller
  • Make it centered
  • Make it a different color

Fill in the blank: A data analyst includes _____ in their R Markdown notebook so that they can refer to it directly in their explanation of their analysis.
  • inline code
  • markdown
  • documentation

What symbol can be used to add bullet points in R Markdown?
  • Asterisks
  • Brackets
  • Exclamation marks
  • Backticks

A data analyst adds a section of executable code to their .rmd file so users can execute it and generate the correct output. What is this section of code called?
  • Data plot
  • Documentation
  • YAML
  • Code chunk

A data analyst is inserting a line of code directly into their .rmd file. What will they use to mark the beginning and end of the code?
  • Delimiters
  • Asterisks
  • Markdown
  • Hashtags

A data analyst who works with R creates a weekly sales report by remaking their .rmd file and converting it to a report. What can they do to streamline this process?
  • Create an R notebook
  • Knit their .rmd file
  • Convert their .rmd file
  • Create a template

Shuffle Q/A

R Markdown is a file format for making dynamic documents with R. What are the benefits of creating this kind of document? Select all that apply.
  • Save, organize, and document code
  • Create a record of your cleaning process
  • Perform calculations for analysis more efficiently
  • Generate a report with executable code chunks

A data analyst wants to change their header to be one font size smaller. What should they add to their markdown syntax?
  • Backtick
  • Exclamation mark
  • Double space
  • Hashtag

A data analyst wants to include a line of code directly in their .rmd file in order to explain their process more clearly.
What is this code called?
  • YAML
  • Markdown
  • Documented
  • Inline code

Which sample correctly implements a code chunk in a .rmd file?
  • --- value <- 8 ---
  • ```{r} value <- 8 ```
  • ### value <- 8
  • ```{!} value <- 8 ```

What type of export document should you use while you are working and don’t need to worry about adding page breaks in the correct places?
  • HTML
  • YAML
  • PDF
  • Word

What are the benefits of working with R Markdown? Select all that apply.
  • R Markdown runs interactive code chunks.
  • R Markdown runs R code faster.
  • R Markdown makes it possible to use larger datasets.
  • R Markdown allows styled text between code.

A data analyst wants to change the default file format that gets exported by the Knit button to .pdf. What field of the YAML header should they change to set the new default file format?
  • export
  • title
  • output
  • author

A data analyst is reading through an R Markdown notebook and finds the text _this is important_. What is the purpose of the underscore characters in this text?
  • They add the text as an image caption
  • They wrap the text in a clickable link
  • They style the text as bold
  • They style the text as italics

A data analyst works with an .rmd file in RStudio and wants the ability to quickly find a code chunk using the label “analysis”.
Which code example would allow the analyst to quickly access the code chunk using this label?
  • ```{analysis r}
  • ```analysis{r}
  • ```{r analysis}
  • ```{r} analysis

Fill in the blank: A delimiter is a character that indicates the beginning or end of _____.
  • a data item
  • an analysis
  • a section
  • a header

Fill in the blank: R Markdown notebooks can be converted into HTML, PDF, and Word documents, slide presentations, and _____.
  • dashboards
  • tables
  • YAML
  • spreadsheets

Which code snippet implements the correct syntax for writing a piece of hyperlinked text in markdown?
  • link**www.coursera.com
  • (link)[www.coursera.com]
  • [link](www.coursera.com)
  • >link–www.cousera.com<

What is the purpose of the Knit button in RStudio?
  • It combines multiple .rmd files into a single file.
  • It imports the content from a .rmd file.
  • It creates a new .rmd file.
  • It exports the .rmd file to another document type.

A data analyst wants to make a word in their markdown stand out by making it bold. What characters should they surround the text with to achieve the bold style?
  • Angle brackets (<>)
  • Double asterisks (**)
  • Double hashtag (##)
  • Single asterisk (*)

A data analyst is working in a .rmd file and comes across the text ```{r analysis}. What is the purpose of the text “analysis”?
  • It is a label for the code chunk
  • It changes the way the code gets exported
  • It runs the code in analysis mode
  • It alters the output file format of Knit

Why would a data analyst create a template of their .rmd file? Select all that apply.
  • To create an interactive notebook
  • To prevent other users from editing the file
  • To customize the appearance of a final report
  • To save time when creating the same kind of document

A data analyst wants to perform an analysis and make it easy for colleagues to understand his process and update the analysis a year from now. Which tool is best to achieve this objective?
  • Code chunks
  • R Markdown
  • PDF document
  • Word Document

A data analyst needs to create a shareable report in RStudio.
They first want to change the default file format that gets exported by the Knit button to .pdf. What value should they use for the output field in the YAML header?
  • pdf_knit
  • pdf_document
  • document_pdf
  • knit_pdf

What does the ```{r} delimiter (three backticks followed by an r contained inside curly brackets) indicate in an R Markdown notebook?
  • The start of YAML metadata
  • The end of a code chunk
  • The end of YAML metadata
  • The start of a code chunk

A data analyst notices that their header is much smaller than they wanted it to be. What happened?
  • They have too few asterisks
  • They have too many hashtags
  • They have too few hashtags
  • They have too many asterisks

Fill in the blank: _____ code is code that can be inserted directly into a .rmd file.
  • Executable
  • Markdown
  • Inline
  • YAML

Fill in the blank: If an analyst creates the same kind of document over and over or customizes the appearance of a final report, they can use _____ to save them time.
  • a filter
  • a code chunk
  • an .rmd file
  • a template

Which combination of text characters can be used to embed an image in a markdown document?
  • ![]()
  • ##
  • <>
  • *[]()

When you Knit a file in RStudio, what part of code chunks is shown by default?
  • The delimiter
  • The output
  • The YAML
  • The code

A data analyst comes across text wrapped in angle brackets in a piece of markdown text. What effect do the angle brackets (<>) have on the inner text?
  • They create a piece of inline code
  • They create a clickable link
  • They create a bullet list
  • They create bold text

Course challenge

After previewing and cleaning your data, you determine what variables are most relevant to your analysis. Your main focus is on Rating, Cocoa.Percent, and Bean.Type.
You decide to use the select() function to create a new data frame with only these three variables.
Assume the first part of your code is:
    trimmed_flavors_df <- flavors_df %>%
Add the code chunk that lets you select the three variables.
What bean type appears in row 6 of your tibble?
  • Beniano
  • Forastero
  • Criollo
  • Trinitario

Scenario 1, questions 1-7

As part of the data science team at Gourmet Analytics, you use data analytics to advise companies in the food industry. You clean, organize, and visualize data to arrive at insights that will benefit your clients. As a member of a collaborative team, sharing your analysis with others is an important part of your job.

Your current client is Chocolate and Tea, an up-and-coming chain of cafes. The eatery combines an extensive menu of fine teas with chocolate bars from around the world. Their diverse selection includes everything from plantain milk chocolate, to tangerine white chocolate, to dark chocolate with pistachio and fig. The encyclopedic list of chocolate bars is the basis of Chocolate and Tea’s brand appeal. Chocolate bar sales are the main driver of revenue.

Chocolate and Tea aims to serve chocolate bars that are highly rated by professional critics. They also continually adjust the menu to make sure it reflects the global diversity of chocolate production. The management team regularly updates the chocolate bar list in order to align with the latest ratings and to ensure that the list contains bars from a variety of countries.

They’ve asked you to collect and analyze data on the latest chocolate ratings. In particular, they’d like to know which countries produce the highest-rated bars of super dark chocolate (a high percentage of cocoa). This data will help them create their next chocolate bar menu.

Your team has received a dataset that features the latest ratings for thousands of chocolates from around the world. Click here to access the dataset.
Given the data and the nature of the work you will do for your client, your team agrees to use R for this project. You create a short document about the benefits of using R for the project and share the document with your team. You write that the benefits include R’s ability to quickly process lots of data and easily reproduce and share an analysis. What is another benefit of using R for the project?
  • Automatically clean data
  • Define a problem and ask the right questions
  • Create high-quality visualizations
  • Choose a topic for analysis

Scenario 1, continued

Before you begin working with your data, you need to import it and save it as a data frame. To get started, you open your RStudio workspace and load all the necessary libraries and packages. You upload a .csv file containing the data to RStudio and store it in a project folder named flavors_of_cacao.csv. You use the read_csv() function to import the data from the .csv file. Assume that the name of the data frame is flavors_df and the .csv file is in the working directory. What code chunk lets you create the data frame?
  • read_csv("flavors_of_cacao.csv") <- flavors_df
  • flavors_df <- read_csv("flavors_of_cacao.csv")
  • flavors_df + read_csv("flavors_of_cacao.csv")
  • read_csv(flavors_df <- "flavors_of_cacao.csv")

Scenario 1, continued

Now that you’ve created a data frame, you want to find out more about how the data is organized. The data frame has hundreds of rows and lots of columns. Assume the name of your data frame is flavors_df. What code chunk lets you review the column names in the data frame?
  • col(flavors_df)
  • rename(flavors_df)
  • colnames(flavors_df)
  • arrange(flavors_df)

Scenario 1, continued

Next, you begin to clean your data. When you check out the column headings in your data frame you notice that the first column is named Company...Maker.if.known. (Note: The period after known is part of the variable name.) For the sake of clarity and consistency, you decide to rename this column Maker (without a period at the end).
Assume the first part of your code chunk is:
    flavors_df %>%
What code chunk do you add to change the column name?
  • rename(Maker %<% Company...Maker.if.known.)
  • rename(Company...Maker.if.known %<% Maker)
  • rename(Maker = Company...Maker.if.known.)
  • rename(Company...Maker.if.known. = Maker)

After previewing and cleaning your data, you determine what variables are most relevant to your analysis. Your main focus is on Rating, Cocoa.Percent, and Company. You decide to use the select() function to create a new data frame with only these three variables.
Assume the first part of your code is:
    trimmed_flavors_df <- flavors_df %>%
Add the code chunk that lets you select the three variables.
  • Videri
  • A. Morin
  • Soma
  • Rogue

Next, you select the basic statistics that can help your team better understand the ratings system in your data.
Assume the first part of your code is:
    trimmed_flavors_df %>%
You want to use the summarize() and max() functions to find the maximum rating for your data. Add the code chunk that lets you find the maximum value for the variable Rating.
What is the maximum rating?
  • 4.5
  • 5
  • 6
  • 5.5

After completing your analysis of the rating system, you determine that any rating greater than or equal to 3.5 points can be considered a high rating. You also know that Chocolate and Tea considers a bar to be super dark chocolate if the bar's cocoa percent is greater than or equal to 70%. You decide to create a new data frame to find out which chocolate bars meet these two conditions.
Assume the first part of your code is:
    best_trimmed_flavors_df <- trimmed_flavors_df %>%
You want to apply the filter() function to the variables Cocoa.Percent and Rating. Add the code chunk that lets you filter the data frame for chocolate bars that contain at least 70% cocoa and have a rating of at least 3.5 points.
  • 4.00
  • 4.25
  • 3.75
  • 3.50

Now that you’ve cleaned and organized your data, you’re ready to create some useful data visualizations.
Your team assigns you the task of creating a series of visualizations based on requests from the Chocolate and Tea management team. You decide to use ggplot2 to create your visuals.
Assume your first line of code is:
    ggplot(data = best_trimmed_flavors_df) +
You want to use the geom_bar() function to create a bar chart. Add the code chunk that lets you create a bar chart with the variable Rating on the x-axis.
  • 2
  • 5
  • 6
  • 3

Your bar chart reveals the locations that produce the highest rated chocolate bars. To get a better idea of the specific rating for each location, you’d like to highlight each bar.
Assume that you are working with the following code:
    ggplot(data = best_trimmed_flavors_df) +
    geom_bar(mapping = aes(x = Company.Location))
Add a code chunk to the second line of code to map the aesthetic fill to the variable Rating.
NOTE: the three dots (...) indicate where to add the code chunk.
According to your bar chart, which two company locations produce the highest rated chocolate bars?
  • Canada and France
  • Scotland and U.S.A
  • Scotland and Canada
  • Amsterdam and France

Scenario 2, continued

A teammate creates a new plot based on the chocolate bar data. The teammate asks you to make some revisions to their code.
Assume your teammate shares the following code chunk:
    ggplot(data = best_trimmed_flavors_df) +
    geom_bar(mapping = aes(x = Cocoa.Percent)) +
What code chunk do you add to the third line to create wrap around facets of the variable Cocoa.Percent?
  • facet_wrap(Cocoa.Percent~)
  • facet_wrap(~Cocoa.Percent)
  • facet(=Cocoa.Percent)
  • facet_wrap(%>%Cocoa.Percent)

Scenario 2, continued

Your team has created some basic visualizations to explore different aspects of the chocolate bar data. You’ve volunteered to add titles to the plots.
You begin with a scatterplot.
Assume the first part of your code chunk is:
    ggplot(data = trimmed_flavors_df) +
    geom_point(mapping = aes(x = Cocoa.Percent, y = Rating)) +
What code chunk do you add to the third line to add the title Recommended Bars to your plot?
  • labs(title = "Recommended Bars")
  • labs(title = Recommended Bars)
  • labs("Recommended Bars")
  • labs(title + "Recommended Bars")

Scenario 2, continued

Next, you create a new scatterplot to explore the relationship between different variables. You want to save your plot so you can access it later on. You know that the ggsave() function defaults to saving the last plot that you displayed in RStudio, so you’re ready to write the code to save your scatterplot.
Assume your first two lines of code are:
    ggplot(data = trimmed_flavors_df) +
    geom_point(mapping = aes(x = Cocoa.Percent, y = Rating)) +
What code chunk do you add to the third line to save your plot as a jpeg file with chocolate as the file name?
  • ggsave("chocolate.jpeg")
  • ggsave("chocolate.png")
  • ggsave("jpeg.chocolate")
  • ggsave(chocolate.jpeg)

Scenario 2, continued

As a final step in the analysis process, you create a report to document and share your work. Before you share your work with the management team at Chocolate and Tea, you are going to meet with your team and get feedback. Your team wants the documentation to include all your code and display all your visualizations.
You decide to create an R Markdown notebook to document your work. What are your reasons for choosing an R Markdown notebook? Select all that apply.
  • It lets you record and share every step of your analysis
  • It allows users to run your code
  • It automatically creates a website to show your work
  • It displays your data visualizations

Shuffle Q/A

Scenario 1, questions 1-7

As part of the data science team at Gourmet Analytics, you use data analytics to advise companies in the food industry. You clean, organize, and visualize data to arrive at insights that will benefit your clients.
As a member of a collaborative team, sharing your analysis with others is an important part of your job.

Your current client is Chocolate and Tea, an up-and-coming chain of cafes. The eatery combines an extensive menu of fine teas with chocolate bars from around the world. Their diverse selection includes everything from plantain milk chocolate, to tangerine white chocolate, to dark chocolate with pistachio and fig. The encyclopedic list of chocolate bars is the basis of Chocolate and Tea’s brand appeal. Chocolate bar sales are the main driver of revenue.

Chocolate and Tea aims to serve chocolate bars that are highly rated by professional critics. They also continually adjust the menu to make sure it reflects the global diversity of chocolate production. The management team regularly updates the chocolate bar list in order to align with the latest ratings and to ensure that the list contains bars from a variety of countries.

They’ve asked you to collect and analyze data on the latest chocolate ratings. In particular, they’d like to know which countries produce the highest-rated bars of super dark chocolate (a high percentage of cocoa). This data will help them create their next chocolate bar menu.

Your team has received a dataset that features the latest ratings for thousands of chocolates from around the world. Click here to access the dataset. Given the data and the nature of the work you will do for your client, your team agrees to use R for this project.

Your supervisor asks you to write a short summary of the benefits of using R for the project. Which of the following benefits would you include in your summary? Select all that apply.
  • Quickly process lots of data
  • Create high-quality data visualizations
  • Define a problem and ask the right questions
  • Easily reproduce and share the analysis

Scenario 1, continued

Before you begin working with your data, you need to import it and save it as a data frame.
To get started, you open your RStudio workspace and load the tidyverse library. You upload a .csv file containing the data to RStudio and store it in a project folder named flavors_of_cacao.csv.
You use the read_csv() function to import the data from the .csv file. Assume that the name of the data frame is bars_df and the .csv file is in the working directory. What code chunk lets you create the data frame?
  • bars_df %>% read_csv("flavors_of_cacao.csv")
  • read_csv("flavors_of_cacao.csv") + bars_df
  • bars_df <- read_csv("flavors_of_cacao.csv")
  • bars_df + read_csv("flavors_of_cacao.csv")

Scenario 1, continued

Now that you’ve created a data frame, you want to find out more about how the data is organized. The data frame has hundreds of rows and lots of columns.
Assume the name of your data frame is flavors_df. What code chunk lets you review the structure of the data frame?
  • filter(flavors_df)
  • str(flavors_df)
  • select(flavors_df)
  • summarize(flavors_df)

Scenario 1, continued

Next, you begin to clean your data. When you check out the column headings in your data frame you notice that the first column is named Company...Maker.if.known. (Note: The period after known is part of the variable name.) For the sake of clarity and consistency, you decide to rename this column Brand (without a period at the end).
Assume the first part of your code chunk is:
    flavors_df %>%
What code chunk do you add to change the column name?
  • rename(Brand = Company...Maker.if.known.)
  • rename(Company...Maker.if.known. = Brand)
  • rename(Company...Maker.if.known. , Brand)
  • rename(Brand, Company...Maker.if.known.)

After previewing and cleaning your data, you determine what variables are most relevant to your analysis. Your main focus is on Rating, Cocoa.Percent, and Company.Location.
You decide to use the select() function to create a new data frame with only these three variables.
Assume the first part of your code is:
    trimmed_flavors_df <- flavors_df %>%
Add the code chunk that lets you select the three variables.
What company location appears in row 1 of your tibble?
  • Scotland
  • Canada
  • Colombia
  • France

After completing your analysis of the rating system, you determine that any rating greater than or equal to 3.75 points can be considered a high rating. You also know that Chocolate and Tea considers a bar to be super dark chocolate if the bar's cocoa percentage is greater than or equal to 80%. You decide to create a new data frame to find out which chocolate bars meet these two conditions.
Assume the first part of your code is:
    best_trimmed_flavors_df <- trimmed_flavors_df %>%
You want to apply the filter() function to the variables Cocoa.Percent and Rating. Add the code chunk that lets you filter the new data frame for chocolate bars that contain at least 80% cocoa and have a rating of at least 3.75 points.
How many rows does your tibble include?
  • 22
  • 20
  • 12
  • 8

Scenario 2, continued

A teammate creates a new plot based on the chocolate bar data. The teammate asks you to make some revisions to their code.
Assume your teammate shares the following code chunk:
    ggplot(data = best_trimmed_flavors_df) +
    geom_bar(mapping = aes(x = Company)) +
What code chunk do you add to the third line to create wrap around facets of the variable Company?
  • facet(Company)
  • facet_wrap(+Company)
  • facet_wrap(~Company)
  • facet_wrap(=Company)

Scenario 2, continued

Your team has created some basic visualizations to explore different aspects of the chocolate bar data. You’ve volunteered to add titles to the plots.
You begin with a scatterplot.
Assume the first part of your code chunk is:
    ggplot(data = trimmed_flavors_df) +
    geom_point(mapping = aes(x = Cocoa.Percent, y = Rating)) +
What code chunk do you add to the third line to add the title Suggested Chocolate to your plot?
  • labs(title = "Suggested Chocolate")
  • labs(Suggested Chocolate = title)
  • labs(Suggested Chocolate)
  • labs <- "Suggested Chocolate"

Scenario 2, continued

Next, you create a new scatterplot to explore the relationship between different variables. You want to save your plot so you can access it later on. You know that the ggsave() function defaults to saving the last plot that you displayed in RStudio, so you’re ready to write the code to save your scatterplot.
Assume your first two lines of code are:
    ggplot(data = trimmed_flavors_df) +
    geom_point(mapping = aes(x = Cocoa.Percent, y = Rating)) +
What code chunk do you add to the third line to save your plot as a pdf file with “chocolate” as the file name?
  • ggsave("chocolate.png")
  • ggsave("chocolate.pdf")
  • ggsave("pdf.chocolate")
  • ggsave(chocolate.pdf)

Scenario 2, continued

As a final step in the analysis process, you create a report to document and share your work. Before you share your work with the management team at Chocolate and Tea, you are going to meet with your team and get feedback. Your team wants the documentation to include all your code and display all your visualizations.
Fill in the blank: You want to record and share every step of your analysis, let teammates run your code, and display your visualizations. You decide to create _____ to document your work.
  • a database
  • a spreadsheet
  • an R Markdown notebook
  • a data frame

Scenario 1, continued

Before you begin working with your data, you need to import it and save it as a data frame. To get started, you open your RStudio workspace and load the tidyverse library.
You upload a .csv file containing the data to RStudio and store it in a project folder named flavors_of_cacao.csv.
You use the read_csv() function to import the data from the .csv file. Assume that the name of the data frame is chocolate_df and the .csv file is in the working directory. What code chunk lets you create the data frame?
  • read_csv("flavors_of_cacao.csv") + chocolate_df
  • chocolate_df <- "flavors_of_cacao.csv"(read_csv)
  • chocolate_df <- read_csv("flavors_of_cacao.csv")
  • chocolate_df + read_csv("flavors_of_cacao.csv")

Scenario 1, continued

Next, you begin to clean your data. When you check out the column headings in your data frame you notice that the first column is named Company...Maker.if.known. (Note: The period after known is part of the variable name.) For the sake of clarity and consistency, you decide to rename this column Company (without a period at the end).
Assume the first part of your code chunk is:
    flavors_df %>%
What code chunk do you add to change the column name?
  • rename(Company = Company...Maker.if.known.)
  • rename(Company...Maker.if.known. <- Company)
  • rename(Company...Maker.if.known. = Company)
  • rename(Company <- Company...Maker.if.known.)

Scenario 2, continued

As a final step in the analysis process, you create a report to document and share your work. Before you share your work with the management team at Chocolate and Tea, you are going to meet with your team and get feedback. Your team wants the documentation to include all your code and display all your visualizations.
You want to record and share every step of your analysis, let teammates run your code, and display your visualizations.
What do you use to document your work?
  • A database
  • A spreadsheet
  • A data frame
  • An R Markdown notebook

Next, you select the basic statistics that can help your team better understand the ratings system in your data.
Assume the first part of your code is:
    trimmed_flavors_df %>%
You want to use the summarize() and mean() functions to find the mean rating for your data. Add the code chunk that lets you find the mean value for the variable Rating.
What is the mean rating?
  • 3.995445
  • 3.185933
  • 4.701337
  • 4.230765

Now that you’ve cleaned and organized your data, you’re ready to create some useful data visualizations. Your team assigns you the task of creating a series of visualizations based on requests from the Chocolate and Tea management team. You decide to use ggplot2 to create your visuals.
Assume your first line of code is:
    ggplot(data = best_trimmed_flavors_df) +
You want to use the geom_bar() function to create a bar chart. Add the code chunk that lets you create a bar chart with the variable Company on the x-axis.
How many bars does your bar chart display?
  • 6
  • 4
  • 8
  • 10

Course 8 - Google Data Analytics Capstone: Complete a Case Study

Week 1 – Learn about capstone basics

Test your knowledge on professional case studies
  • portfolio
  • capstone
  • personal website
  • problem statement

Which of the following are important strategies when completing a case study? Select all that apply.
  • Communicate the assumptions you made about the data
  • Use a programming language
  • Document the steps you’ve taken to reach your conclusion
  • Answer the question being asked

To successfully complete a case study, your answer to the question the case study asks has to be perfect.
  • True
  • False

Which of the following are qualities of the best portfolios for a junior data analyst? Select all that apply.
  • Personal
  • Unique
  • Large
  • Simple

Which of the following are places where you can store and share your portfolio?
Select all that apply.
  • Tableau
  • RStudio
  • GitHub
  • Kaggle

Shuffle Q/A

Fill in the blank: A _____ is a collection of case studies that you can share with potential employers.
  • portfolio
  • capstone
  • problem statement
  • personal website

Week 3 – Optional: Using your portfolio

An elevator pitch gives potential employers a quick, high-level understanding of your professional experience. What are the key considerations when creating an elevator pitch? Select all that apply.
  • Focus on your process over the results
  • Consider your audience’s interests
  • Keep it fresh by not over-practicing it
  • Make sure it’s short enough that it can be explained to someone during an elevator ride

What are the key purposes of discussing a case study during an interview? Select all that apply.
  • Outline your thinking about a data analytics scenario for your interviewer
  • Ask your potential employer questions about the company
  • Negotiate a fair salary for the position
  • Recommend real-world solutions based on your own work

If an interviewer says, “Tell me about yourself,” it’s important to limit your response to topics related to data analytics.
  • True
  • False

During an interview, you will likely respond to technical questions, practical knowledge questions, and questions about your personal experiences. What strategies can help you prepare to respond effectively? Select all that apply.
  • Copy real-world examples from more experienced professionals to include in your responses
  • Write down your answers to common questions
  • Practice your responses until they feel natural and unrehearsed
  • Brainstorm examples from your own experiences that support your answers

Imagine that an interviewer asks, “How do you maintain data integrity?” What topics does this question give you the opportunity to discuss?
Select all that apply.
  • The reasons you strongly prefer SQL over spreadsheets for data cleaning
  • The impact that issues with your data can have on business decisions
  • The methods you would use for error checking and data validation
  • The importance of reliability and accuracy in good data analysis

Week 4

Did you complete a case study?

We hope you were excited about the opportunity to complete an optional case study in this course. It's a great way to showcase your new data analytics skills to potential employers.
Please let us know whether or not you completed a case study; you’ll be able to proceed with the course either way!
  • Yes, I completed a case study.
  • No, I skipped the case study.
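
For readers who want to try the data-preparation steps that the Chocolate and Tea course challenge describes (rename, select, summarize, filter), the following is a minimal R sketch using dplyr. It makes several assumptions: the four-row data frame is an invented stand-in for flavors_of_cacao.csv, Cocoa.Percent is treated as a plain number, and the 70 and 3.5 thresholds are taken from one version of the question. None of the values come from the real dataset, so the results say nothing about the quiz answers.

```r
library(dplyr)

# Hypothetical stand-in for flavors_of_cacao.csv; every value here is invented.
flavors_df <- data.frame(
  Company...Maker.if.known. = c("Maker A", "Maker B", "Maker C", "Maker D"),
  Cocoa.Percent = c(70, 85, 64, 80),   # assumed numeric, not text like "70%"
  Rating = c(3.75, 4.00, 3.50, 3.25),
  Company.Location = c("Canada", "France", "Canada", "Scotland")
)

# Rename the first column: the new name goes on the left of the equals sign.
renamed_df <- flavors_df %>%
  rename(Maker = Company...Maker.if.known.)

# Keep only the three variables of interest.
trimmed_flavors_df <- renamed_df %>%
  select(Rating, Cocoa.Percent, Company.Location)

# Summary statistics for the Rating column.
rating_summary <- trimmed_flavors_df %>%
  summarize(max_rating = max(Rating), mean_rating = mean(Rating))

# Rows that meet both conditions: at least 70% cocoa and a rating of at least 3.5.
best_trimmed_flavors_df <- trimmed_flavors_df %>%
  filter(Cocoa.Percent >= 70, Rating >= 3.5)

print(colnames(renamed_df))
print(rating_summary)
print(nrow(best_trimmed_flavors_df))
```

With the real file, the first step would instead be an import such as flavors_df <- read_csv("flavors_of_cacao.csv"), and the filter condition may need adjusting if Cocoa.Percent is stored as text.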

6. [PDF] Georgia's Open Meetings and Open Records Laws - ACCG

  • What if the board of commissioners only has ... by-case basis, may include those individuals whose presence is consistent with the exception to the open meetings ...

7. [PDF] DEPARTMENT OF PUBLIC HEALTH 105 CMR 700.000 - Mass.gov

  • Feb 3, 2023 · ... are employed has verified with the appropriate Board of Registration, if applicable, that the person is permitted to dispense controlled ...

8. Beneficial Ownership Information Reporting Requirements

  • Sep 30, 2022 · FinCEN is issuing a final rule requiring certain entities to file with FinCEN reports that identify two categories of individuals: the ...

  • FinCEN is issuing a final rule requiring certain entities to file with FinCEN reports that identify two categories of individuals: the beneficial owners of the entity, and individuals who have filed an application with specified governmental authorities to create the entity or register it to do...

9. Procedural Due Process Civil :: Fourteenth Amendment - Justia Law

  • ... persons and only those persons that it was the purpose of the legislature to reach. The doctrine in effect afforded the Court the opportunity to choose ...

  • Analysis and Interpretation of the US Constitution

10. [PDF] BOARD OF TRUSTEES REGULAR MEETING - Alameda Health System

  • Sep 13, 2023 · list was only for Board meetings and did not include the various other meetings and conferences Trustees attend. Trustee Fox asked if the ...

11. [PDF] Improving State Voter Registration Databases

  • NOTICE: The project that is the subject of this report was approved by the Governing Board of the National Research Council, whose members are ...

Article information

Author: Duncan Muller

Last Updated: 12/01/2023

Author information

Name: Duncan Muller

Birthday: 1997-01-13

Address: Apt. 505 914 Phillip Crossroad, O'Konborough, NV 62411

Phone: +8555305800947

Job: Construction Agent

Hobby: Shopping, Table tennis, Snowboarding, Rafting, Motor sports, Homebrewing, Taxidermy

Introduction: My name is Duncan Muller. I am an enchanting, good, gentle, modern, tasty, nice, elegant person who loves writing and wants to share my knowledge and understanding with you.