Mining Massive Datasets

Free course

We introduce the participant to modern distributed file systems and MapReduce, including what distinguishes good MapReduce algorithms from good algorithms in general. The rest of the course is devoted to algorithms for extracting models and information from large datasets. Participants will learn how Google's PageRank algorithm models the importance of Web pages, along with some of the many extensions that have been used for a variety of purposes. We'll cover locality-sensitive hashing, a bit of magic that lets you find similar items in a set so large that you cannot possibly compare every pair. When data is stored as a very large, sparse matrix, dimensionality reduction is often a good way to model the data, but standard approaches do not scale well; we'll talk about efficient approaches. Many other large-scale algorithms are covered as well, as outlined in the course syllabus.
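To give a feel for how PageRank models page importance, here is a minimal sketch of power iteration with teleportation. It is an illustration only and is not taken from the course materials: the function name `pagerank`, the `links` dictionary format, the damping value `beta=0.85`, and the three-page toy graph are all assumptions, and the sketch assumes every link target also appears as a key in `links`.

```python
def pagerank(links, beta=0.85, iterations=50):
    """Approximate PageRank by power iteration with teleportation.

    links: dict mapping each page to the list of pages it links to
           (assumed format; every target must also be a key).
    beta:  probability of following a link instead of teleporting
           (0.85 is a common choice, assumed here).
    """
    nodes = sorted(links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}  # start from a uniform distribution

    for _ in range(iterations):
        new_rank = {v: 0.0 for v in nodes}
        for v in nodes:
            out = links[v]
            if out:
                # Each page passes a beta-scaled, equal share to its out-links.
                share = beta * rank[v] / len(out)
                for w in out:
                    new_rank[w] += share
        # Whatever mass was not passed along (teleport probability plus
        # any dead-end pages) is spread uniformly over all pages.
        leaked = 1.0 - sum(new_rank.values())
        for v in nodes:
            new_rank[v] += leaked / n
        rank = new_rank
    return rank


# Hypothetical toy graph: a links to b and c, b links to c, c links to a.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(graph)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph, page c receives links from both other pages and ends up ranked highest, while b, which is reached only through half of a's vote, ranks lowest. Real Web graphs are far too large for in-memory dictionaries, which is where the course's MapReduce material comes in.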

This course is available to subscribers.

A digital or printed certificate can be purchased separately after completion.

Course load: 20h
Premium access: 7 days
Created on: 11/12/2017

Full course with certificate!

After completing this course, you can purchase a digital certificate for download at a cost of $ 9.20.

Improve your chances of being hired by backing up your skills with certificates.

Syllabus (Stanford University lectures):

Lesson #1 - Distributed File Systems
Lesson #2 - The MapReduce Computational Model
Lesson #3 - Scheduling and Data Flow
Lesson #4 - Combiners and Partition Functions (Advanced)
Lesson #5 - Link Analysis and PageRank
Lesson #6 - PageRank: The Flow Formulation
Lesson #7 - PageRank: The Matrix Formulation
Lesson #8 - PageRank: Power Iteration
Lesson #9 - PageRank: The Google Formulation
Lesson #10 - Why Teleports Solve the Problem
Lesson #11 - How We Really Compute PageRank
Lesson #12 - Finding Similar Sets
Lesson #13 - Minhashing
Lesson #14 - Locality-Sensitive Hashing
Lesson #15 - Applications of LSH
Lesson #16 - Fingerprint Matching
Lesson #17 - Finding Duplicate News Articles
Lesson #18 - Distance Measures
Lesson #19 - Nearest Neighbor Learning
Lesson #20 - Frequent Itemsets
Lesson #21 - A Priori Algorithm
Lesson #22 - Improvements to A Priori (Advanced)
Lesson #23 - All or Most Frequent Itemsets in 2 Passes (Advanced)
Lesson #24 - Community Detection in Graphs: Motivation
Lesson #25 - The Affiliation Graph Model
Lesson #26 - From AGM to BIGCLAM
Lesson #27 - Solving the BIGCLAM
Lesson #28 - Detecting Communities as Clusters (Advanced)
Lesson #29 - What Makes a Good Cluster (Advanced)
Lesson #30 - The Graph Laplacian Matrix (Advanced)
Lesson #31 - Examples of Eigendecompositions of Graphs (Advanced)
Lesson #32 - Defining the Graph Laplacian (Advanced)
Lesson #33 - Spectral Graph Partitioning: Finding a Partition (Advanced)
Lesson #34 - Spectral Clustering: Three Steps (Advanced)
Lesson #35 - Analysis of Large Graphs: Trawling (Advanced)
Lesson #36 - Mining Data Streams
Lesson #37 - Counting 1's (Advanced)
Lesson #38 - Bloom Filters
Lesson #39 - Sampling a Stream
Lesson #40 - Counting Distinct Elements (Advanced)
Lesson #41 - Overview of Recommender Systems
Lesson #42 - Content-Based Recommendations
Lesson #43 - Collaborative Filtering
Lesson #44 - Implementing Collaborative Filtering (Advanced)
Lesson #45 - Evaluating Recommender Systems
Lesson #46 - Dimensionality Reduction: Introduction
Lesson #47 - Singular Value Decomposition
Lesson #48 - Dimensionality Reduction with SVD
Lesson #49 - SVD Gives the Best Low Rank Approximation (Advanced)
Lesson #50 - SVD Example and Conclusion
Lesson #51 - CUR Decomposition (Advanced)
Lesson #52 - The CUR Algorithm (Advanced)
Lesson #53 - Discussion of the CUR Method
Lesson #54 - Latent Factor Models
Lesson #55 - Latent Factor Recommender System
Lesson #56 - Finding the Latent Factors
Lesson #57 - Extension to Include Global Effects (Advanced)
Lesson #58 - Overview of Clustering
Lesson #59 - Hierarchical Clustering
Lesson #60 - The k-Means Algorithm
Lesson #61 - The BFR Algorithm
Lesson #62 - The CURE Algorithm (Advanced)
Lesson #63 - Computational Advertising: Bipartite Graph Matching
Lesson #64 - The AdWords Problem
Lesson #65 - The Balance Algorithm
Lesson #66 - Generalized Balance (Advanced)
Lesson #67 - Support Vector Machines: Introduction
Lesson #68 - Support Vector Machines: Mathematical Formulation
Lesson #69 - What is the Margin
Lesson #70 - Soft Margin SVMs
Lesson #71 - How to Compute the Margin (Advanced)
Lesson #72 - Support Vector Machines: Example
Lesson #73 - Decision Trees
Lesson #74 - How to Construct a Tree
Lesson #75 - Information Gain
Lesson #76 - Building Decision Trees Using MapReduce (Advanced)
Lesson #77 - Decision Trees: Conclusion
Lesson #78 - MapReduce Algorithms Part I (Advanced)
Lesson #79 - MapReduce Algorithms Part II (Advanced)
Lesson #80 - Theory of MapReduce Algorithms (Advanced)
Lesson #81 - Matrix Multiplication in MapReduce (Advanced)
Lesson #82 - LSH Families
Lesson #83 - More About LSH Families
Lesson #84 - Sets and Strings With a High Degree of Similarity (Advanced)
Lesson #85 - Prefix of a String (Advanced)
Lesson #86 - Positions Within Prefixes (Advanced)
Lesson #87 - Exploiting Length (Advanced)
Lesson #88 - Computing PageRank on Big Graphs (Advanced)
Lesson #89 - Topic-Specific PageRank
Lesson #90 - Application to Measuring Proximity in Graphs
Lesson #91 - Hubs and Authorities (Advanced)
Lesson #92 - Web Spam: Introduction
Lesson #93 - Spam Farms
Lesson #94 - TrustRank

Course provider:

Learncafe in English

You can never learn too much. We created this profile so you can access a variety of free courses. From the available material, you can gain new knowledge on topics such as education and health, among other areas. All works and materials remain the property of their respective authors.

0 people have enrolled in this course.

Modules & lessons

Module 1: Mining Massive Datasets

Choose a plan:

Free Plan: Free
  • 20 class hours
  • 30 days of free access*
  • Waiting period of 120 days**
  • Certificate of completion***
  • Immediate access
* Access to the course content is valid for the duration of the monthly subscription or one-off purchase.
** The waiting period is the total time a user must wait before starting another course on the platform.
*** The certificate of completion is offered separately for one-off purchase in two formats: digital for download and printed, sent via Correios.
