Mining Massive Datasets

We introduce participants to modern distributed file systems and MapReduce, including what distinguishes good MapReduce algorithms from good algorithms in general. The rest of the course is devoted to algorithms for extracting models and information from large datasets. Participants will learn how Google's PageRank algorithm models the importance of Web pages, and will see some of the many extensions of it that have been used for a variety of purposes. We'll cover locality-sensitive hashing, a bit of magic that allows you to find similar items in a set of items so large you cannot possibly compare each pair. When data is stored as a very large, sparse matrix, dimensionality reduction is often a good way to model the data, but standard approaches do not scale well; we'll talk about efficient approaches. Many other large-scale algorithms are covered as well, as outlined in the course syllabus.


This course is available to subscribers.

A digital or printed certificate can be purchased separately after completion.

Complete course with certificate!

After completing this course, you can purchase a separate digital certificate for download at a cost of $5.29.

Improve your chances of being hired by backing up your skills with certificates.

Syllabus (lectures from Stanford University):

Lesson #1 - Distributed File Systems
Lesson #2 - The MapReduce Computational Model
Lesson #3 - Scheduling and Data Flow
Lesson #4 - Combiners and Partition Functions (Advanced)
Lesson #5 - Link Analysis and PageRank
Lesson #6 - PageRank: The Flow Formulation
Lesson #7 - PageRank: The Matrix Formulation
Lesson #8 - PageRank: Power Iteration
Lesson #9 - PageRank: The Google Formulation
Lesson #10 - Why Teleports Solve the Problem
Lesson #11 - How We Really Compute PageRank
Lesson #12 - Finding Similar Sets
Lesson #13 - Minhashing
Lesson #14 - Locality-Sensitive Hashing
Lesson #15 - Applications of LSH
Lesson #16 - Fingerprint Matching
Lesson #17 - Finding Duplicate News Articles
Lesson #18 - Distance Measures
Lesson #19 - Nearest Neighbor Learning
Lesson #20 - Frequent Itemsets
Lesson #21 - A Priori Algorithm
Lesson #22 - Improvements to A Priori (Advanced)
Lesson #23 - All or Most Frequent Itemsets in 2 Passes (Advanced)
Lesson #24 - Community Detection in Graphs: Motivation
Lesson #25 - The Affiliation Graph Model
Lesson #26 - From AGM to BIGCLAM
Lesson #27 - Solving the BIGCLAM
Lesson #28 - Detecting Communities as Clusters (Advanced)
Lesson #29 - What Makes a Good Cluster (Advanced)
Lesson #30 - The Graph Laplacian Matrix (Advanced)
Lesson #31 - Examples of Eigendecompositions of Graphs (Advanced)
Lesson #32 - Defining the Graph Laplacian (Advanced)
Lesson #33 - Spectral Graph Partitioning: Finding a Partition (Advanced)
Lesson #34 - Spectral Clustering: Three Steps (Advanced)
Lesson #35 - Analysis of Large Graphs: Trawling (Advanced)
Lesson #36 - Mining Data Streams
Lesson #37 - Counting 1's (Advanced)
Lesson #38 - Bloom Filters
Lesson #39 - Sampling a Stream
Lesson #40 - Counting Distinct Elements (Advanced)
Lesson #41 - Overview of Recommender Systems
Lesson #42 - Content-Based Recommendations
Lesson #43 - Collaborative Filtering
Lesson #44 - Implementing Collaborative Filtering (Advanced)
Lesson #45 - Evaluating Recommender Systems
Lesson #46 - Dimensionality Reduction: Introduction
Lesson #47 - Singular Value Decomposition
Lesson #48 - Dimensionality Reduction with SVD
Lesson #49 - SVD Gives the Best Low-Rank Approximation (Advanced)
Lesson #50 - SVD Example and Conclusion
Lesson #51 - CUR Decomposition (Advanced)
Lesson #52 - The CUR Algorithm (Advanced)
Lesson #53 - Discussion of the CUR Method
Lesson #54 - Latent Factor Models
Lesson #55 - Latent Factor Recommender System
Lesson #56 - Finding the Latent Factors
Lesson #57 - Extension to Include Global Effects (Advanced)
Lesson #58 - Overview of Clustering
Lesson #59 - Hierarchical Clustering
Lesson #60 - The k-Means Algorithm
Lesson #61 - The BFR Algorithm
Lesson #62 - The CURE Algorithm (Advanced)
Lesson #63 - Computational Advertising: Bipartite Graph Matching
Lesson #64 - The AdWords Problem
Lesson #65 - The Balance Algorithm
Lesson #66 - Generalized Balance (Advanced)
Lesson #67 - Support Vector Machines: Introduction
Lesson #68 - Support Vector Machines: Mathematical Formulation
Lesson #69 - What is the Margin
Lesson #70 - Soft-Margin SVMs
Lesson #71 - How to Compute the Margin (Advanced)
Lesson #72 - Support Vector Machines: Example
Lesson #73 - Decision Trees
Lesson #74 - How to Construct a Tree
Lesson #75 - Information Gain
Lesson #76 - Building Decision Trees Using MapReduce (Advanced)
Lesson #77 - Decision Trees: Conclusion
Lesson #78 - MapReduce Algorithms Part I (Advanced)
Lesson #79 - MapReduce Algorithms Part II (Advanced)
Lesson #80 - Theory of MapReduce Algorithms (Advanced)
Lesson #81 - Matrix Multiplication in MapReduce (Advanced)
Lesson #82 - LSH Families
Lesson #83 - More About LSH Families
Lesson #84 - Sets and Strings With a High Degree of Similarity (Advanced)
Lesson #85 - Prefix of a String (Advanced)
Lesson #86 - Positions Within Prefixes (Advanced)
Lesson #87 - Exploiting Length (Advanced)
Lesson #88 - Computing PageRank on Big Graphs (Advanced)
Lesson #89 - Topic-Specific PageRank
Lesson #90 - Application to Measuring Proximity in Graphs
Lesson #91 - Hubs and Authorities (Advanced)
Lesson #92 - Web Spam: Introduction
Lesson #93 - Spam Farms
Lesson #94 - TrustRank


Choose a plan:

Free Plan

20 class hours
30 days of free access*
Waiting period of 120 days**
Certificate of completion***
Immediate access


