Introduction to Data Analytics and Lifecycle – M1-DAV

Module 1 of the DAV notes: an introduction to data analytics and the data analytics lifecycle.

Data Analytics Lifecycle Overview

The data analytics lifecycle is designed for complex data problems and data science projects. It comprises six phases, and work on several phases can proceed at the same time. The cycle is iterative, which reflects how real projects unfold: as new information is uncovered, work can return to an earlier phase.

Key Roles for a Successful Analytics Project

  1. Business User: Holds domain expertise, understanding the intricacies of the subject area.
  2. Project Sponsor: Provides project requirements, guiding the overall direction.
  3. Project Manager: Ensures project objectives are met, overseeing the execution.
  4. Business Intelligence Analyst: Leverages deep understanding of data to provide valuable business insights.
  5. Database Administrator (DBA): Establishes and maintains the database environment.
  6. Data Engineer: Possesses technical skills crucial for data management, extraction, and supporting analytic sandbox environments.
  7. Data Scientist: Utilizes analytic techniques and modeling to derive insights and solutions.

Types of Analytics

  1. Descriptive:
    • Involves techniques for reviewing and examining datasets to understand data patterns and analyze business performance.
  2. Diagnostic:
    • Utilizes techniques to determine what events have occurred and why they happened, delving into causality.
  3. Predictive:
    • Analyzes current and historical data to forecast future outcomes, identifying likely scenarios based on existing trends.
  4. Prescriptive:
    • Involves computational techniques for developing and analyzing alternative courses of action, both tactical and strategic, to address specific scenarios or goals, potentially uncovering unexpected solutions.
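As a rough illustration of how the four types differ, the sketch below applies each one to a made-up series of monthly sales. The figures, the function names, the straight-line trend, and the stock-level rule are all assumptions chosen for this example; real projects would use richer data and methods.

```python
from statistics import mean

# Hypothetical monthly sales figures (units sold), invented for this example.
sales = [100, 110, 120, 130, 140, 150]


def describe(values):
    """Descriptive: summarize what happened."""
    return {"mean": mean(values), "min": min(values), "max": max(values)}


def month_over_month_change(values):
    """Diagnostic (a first step only): locate where changes occurred."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


def forecast_next(values):
    """Predictive: extend a least-squares straight line by one period."""
    n = len(values)
    xs = range(n)
    x_bar, y_bar = mean(xs), mean(values)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    return intercept + slope * n


def recommend_stock(predicted_demand, options):
    """Prescriptive: pick the smallest stock level that covers the forecast."""
    adequate = [level for level in options if level >= predicted_demand]
    return min(adequate) if adequate else max(options)


summary = describe(sales)
changes = month_over_month_change(sales)
prediction = forecast_next(sales)
stock = recommend_stock(prediction, options=[120, 150, 180, 210])
```

Each function answers a different question about the same data: what happened, where it changed, what is likely next, and what to do about it.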

Background and Overview of the Data Analytics Lifecycle

The data analytics lifecycle consists of six phases, each with its own objectives and activities:

Discovery:

  • Objective: Understand the business domain, frame the problem, identify key stakeholders, and develop initial hypotheses.
  • Activities: Interviewing analytics sponsors, gathering requirements, identifying potential data sources.
  • Outcome: Clear understanding of the business problem and initial hypotheses to be tested.

Data Preparation:

  • Objective: Prepare the data for analysis.
  • Activities: Setting up the analytic sandbox environment; performing extract, transform, and load (ETL) or extract, load, and transform (ELT) processes, together abbreviated ETLT; data conditioning; surveying and visualizing the data.
  • Common Tools: ETL tools, data visualization tools, scripting languages (Python, R), SQL.
  • Outcome: Clean, formatted data ready for analysis.
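Data conditioning can be pictured with a small sketch. The one below, using only the Python standard library, trims whitespace, normalizes case, converts types, and drops unusable or duplicate rows. The column names and sample values are invented for illustration, and dropping bad rows is just one possible policy (imputing missing values is another).

```python
import csv
import io

# Hypothetical raw extract; the column names and values are invented.
RAW_CSV = """customer_id,region,amount
1, north ,120.50
2,SOUTH,
3,South,80
3,South,80
4,east,not_available
"""


def condition(raw_text):
    """Return cleaned rows: trimmed, case-normalized, typed, de-duplicated."""
    cleaned, seen = [], set()
    for row in csv.DictReader(io.StringIO(raw_text)):
        region = row["region"].strip().lower()
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # drop rows whose amount is missing or not numeric
        record = (int(row["customer_id"]), region, amount)
        if record in seen:
            continue  # drop exact duplicates
        seen.add(record)
        cleaned.append(
            {"customer_id": record[0], "region": region, "amount": amount}
        )
    return cleaned


rows = condition(RAW_CSV)
```

Of the five raw rows, only customers 1 and 3 survive: one row has no amount, one has a non-numeric amount, and one is a duplicate.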

Model Planning:

  • Objective: Explore the data, select relevant variables, and plan the modeling approach.
  • Activities: Exploratory data analysis (EDA), variable selection techniques, selecting appropriate modeling techniques.
  • Common Tools: Statistical software and languages (R, Python with libraries such as pandas), data mining tools (Weka, KNIME), visualization tools.
  • Outcome: Plan for the development of predictive or descriptive models.
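One simple way to approach variable selection is to rank candidate variables by how strongly they correlate with the target. The sketch below does this with a hand-written Pearson correlation; the data set and variable names are hypothetical, and correlation is only a screening heuristic, not evidence that a variable causes the outcome.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    Assumes neither sequence is constant (otherwise the divisor is zero).
    """
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    var_x = sum((x - x_bar) ** 2 for x in xs)
    var_y = sum((y - y_bar) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def rank_variables(candidates, target):
    """Sort candidate variables by absolute correlation with the target."""
    scores = {name: pearson(values, target) for name, values in candidates.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical data set, invented for this example.
target = [10, 20, 30, 40, 50]        # e.g. weekly sales
candidates = {
    "ad_spend": [1, 2, 3, 4, 5],     # moves with the target
    "discount": [5, 4, 3, 2, 1],     # moves against the target
    "day_code": [2, 1, 3, 1, 2],     # no linear relationship here
}

ranking = rank_variables(candidates, target)
```

Ranking by absolute value keeps strongly negative relationships such as `discount` near the top, since they can be as informative as positive ones.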

Model Building:

  • Objective: Build and train the selected models.
  • Activities: Implementing chosen modeling techniques, fine-tuning model parameters, cross-validation, performance evaluation.
  • Common Tools: Machine learning libraries (scikit-learn, TensorFlow, PyTorch), statistical software, programming languages.
  • Outcome: Trained models ready for evaluation.
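Cross-validation, mentioned in the activities above, can be sketched without any library. The example below fits a least-squares line on part of the data and measures the mean absolute error on the held-out part, repeating this for k folds. The data, the interleaved fold assignment, and the choice of a straight-line model are assumptions for illustration; libraries such as scikit-learn provide more complete tooling.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return y_bar - slope * x_bar, slope


def k_fold_mae(xs, ys, k):
    """Mean absolute error averaged over k held-out folds."""
    fold_errors = []
    for fold in range(k):
        test_idx = set(range(fold, len(xs), k))  # every k-th point is held out
        train_x = [x for i, x in enumerate(xs) if i not in test_idx]
        train_y = [y for i, y in enumerate(ys) if i not in test_idx]
        intercept, slope = fit_line(train_x, train_y)
        errors = [abs(ys[i] - (intercept + slope * xs[i])) for i in test_idx]
        fold_errors.append(sum(errors) / len(errors))
    return sum(fold_errors) / k


# Hypothetical data: y is roughly 2x + 1 with small deviations.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1.1, 2.9, 5.2, 6.8, 9.1, 11.0, 12.9, 15.2, 16.8, 19.1]

cv_error = k_fold_mae(xs, ys, k=5)
```

Because every point is held out exactly once, the averaged error estimates how the model performs on data it was not trained on.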

Communicate Results:

  • Objective: Present findings and insights to stakeholders.
  • Activities: Creating reports, visualizations, and presentations, explaining the results and their implications.
  • Outcome: Clear understanding of the analysis results by stakeholders, actionable insights identified.

Operationalize:

  • Objective: Integrate the models into operational systems for real-world use.
  • Activities: Deployment of models into production environments, monitoring model performance, updating models as needed.
  • Outcome: Models integrated into business processes, delivering value on an ongoing basis.
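Monitoring model performance can be as simple as tracking recent prediction errors and raising a flag when they grow too large. The sketch below is one minimal way to do that; the class name, window size, and threshold are hypothetical, and production systems usually track additional signals such as changes in the input data.

```python
from collections import deque


class ErrorMonitor:
    """Track recent absolute errors and flag when their mean exceeds a limit."""

    def __init__(self, window, threshold):
        self.errors = deque(maxlen=window)  # keeps only the latest `window` errors
        self.threshold = threshold

    def record(self, predicted, actual):
        self.errors.append(abs(predicted - actual))

    def needs_retraining(self):
        if len(self.errors) < self.errors.maxlen:
            return False  # not enough observations yet
        return sum(self.errors) / len(self.errors) > self.threshold


# Hypothetical stream of (prediction, observed outcome) pairs.
monitor = ErrorMonitor(window=3, threshold=5.0)
for predicted, actual in [(100, 101), (100, 98), (100, 103)]:
    monitor.record(predicted, actual)
healthy = monitor.needs_retraining()   # mean error is 2.0, below the threshold

for predicted, actual in [(100, 110), (100, 112), (100, 115)]:
    monitor.record(predicted, actual)
degraded = monitor.needs_retraining()  # mean of the last 3 errors is above 5.0
```

A flag like this would typically trigger a review or a retraining job, which sends the project back to an earlier phase of the lifecycle.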

AI – Summary

The data analytics lifecycle is designed for complex data problems and data science projects. It has six phases, and work can proceed in several phases at once. The cycle is iterative, reflecting how projects actually unfold: work can return to an earlier phase when new information emerges.

Key roles for a successful analytics project:

  1. Business User: Has domain expertise and understands the intricacies of the subject area.
  2. Project Sponsor: Provides the project requirements and guides the overall direction.
  3. Project Manager: Ensures project objectives are met and oversees execution.
  4. Business Intelligence Analyst: Has a deep understanding of the data and provides valuable business insights.
  5. Database Administrator (DBA): Establishes and maintains the database environment.
  6. Data Engineer: Has the technical skills needed for data management, extraction, and supporting analytic sandbox environments.
  7. Data Scientist: Uses analytic techniques and modeling to derive insights and solutions.

Types of analytics:

  1. Descriptive: Techniques for analyzing data patterns and business performance.
  2. Diagnostic: Determining what events occurred and why, to understand causality.
  3. Predictive: Analyzing current and historical data to forecast future outcomes.
  4. Prescriptive: Developing alternative courses of action for specific scenarios or goals, potentially uncovering unexpected solutions.

Phases of the data analytics lifecycle:

  1. Discovery: Understand the business domain, frame the problem, identify key stakeholders, and develop initial hypotheses.
  2. Data Preparation: Prepare the data for analysis.
  3. Model Planning: Explore the data, select relevant variables, and plan the modeling approach.
  4. Model Building: Build and train the selected models.
  5. Communicate Results: Present findings and insights to stakeholders.
  6. Operationalize: Integrate the models into operational systems for real-world use.

Important Questions:

Note: This module covers only a few topics, so study all of them.

  1. Draw a diagram of the data analytics lifecycle. Explain any one phase in detail. Name the tools used for the data preparation phase.
  2. What is an analytic sandbox, and why is it important?
  3. What kinds of tools would be used in the following phases, and for which kinds of use scenarios?
    a. Phase 2: Data preparation
    b. Phase 4: Model building
  4. Explain the phases of the data analytics lifecycle in detail.
Team

This account on Doubtly.in is managed by the core team of Doubtly.
