Pierre-Louis Gottfrois
Bastien Murzeau
Apéro Ruby Bordeaux, 8 November 2011
• Brief introduction


• Case study


• Map / Reduce
What is mongoDB?


 mongoDB is a NoSQL database,
        schemaless and
       document-oriented
Schemaless

• Very useful in ‘agile’ development
  (iterations, fast changes,
  flexibility for developers)

• Supports features that, in relational
  databases, would be:
 • nearly impossible (storing open-ended sets of items, e.g. tags)

 • too complex for what they are (migrations)
Document-oriented
• mongoDB stores documents, not
  rows

 • documents are stored as JSON,
   more precisely binary JSON (BSON)

• the query syntax is as rich as
  SQL's

• the ‘embedded’ documents mechanism
  solves many of the problems encountered
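
As a rough illustration of the points above, here is a sketch of a schemaless document with embedded sub-documents, written as a plain JavaScript object. The field names (`title`, `tags`, `messages`, `author`, `body`) are hypothetical and are not taken from the talk's application.

```javascript
// Hypothetical conversation document: all field names are illustrative only.
const conversation = {
  _id: 1,
  title: "Apéro Ruby",
  // An open-ended list such as tags needs no join table.
  tags: ["ruby", "mongodb"],
  // Embedded documents live inside their parent document.
  messages: [
    { author: "pierre", body: "Hello" },
    { author: "bastien", body: "Hi" }
  ]
};

// Adding a tag or a message requires no schema migration.
conversation.tags.push("nosql");
conversation.messages.push({ author: "pierre", body: "Any questions?" });

// Documents travel as JSON (MongoDB stores them as BSON).
const json = JSON.stringify(conversation);
console.log(JSON.parse(json).messages.length);
```

The whole conversation, messages included, is read and written as one unit, which is the situation where embedding tends to fit.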
Document-oriented

• Documents are stored in a
 collection; in Ruby on Rails, collection = model


• part of this data is indexed
 to optimize performance


• a document is not a trash can!
Storing large volumes
        of data
• mongoDB (and other NoSQL databases) perform
 better at horizontal scaling
 • adding servers to increase storage
   capacity ("sharding")
 • thereby ensuring better availability
 • optimized load balancing between nodes
 • growth that is transparent to the application
Case study
• The ORM becomes an ODM; the reference gem is mongoid
  • or: mongoMapper, DataMapper
• Creating an application based on NoSQL MongoDB
  • rails new nosql
  • edit the Gemfile
    •   gem 'mongoid'

    •   gem 'bson_ext'

  • bundle install
  • rails generate mongoid:config
Case study
• edit config/application.rb
  • #require 'rails/all'
  • require "action_controller/railtie"
  • require "action_mailer/railtie"
  • require "active_resource/railtie"
  • require "rails/test_unit/railtie"
Case study
class Subject
  include Mongoid::Document
  include Mongoid::Timestamps

  has_many :scores,     :as => :scorable, :dependent => :delete, :autosave => true
  has_many :requests,   :dependent => :delete
  belongs_to :author,   :class_name => 'User'
end

class Conversation
  include Mongoid::Document
  include Mongoid::Timestamps

  field :public,            :type => Boolean, :default => false

  has_many :scores,         :as => :scorable, :dependent => :delete
  has_and_belongs_to_many   :subjects
  belongs_to :timeline
  embeds_many :messages
end
Map Reduce
Example


                               A “ticket” collection




{                       {                       {                       {
    “id” : 1,               “id” : 2,               “id” : 3,               “id” : 4,
    “day” : 20111017,       “day” : 20111017,       “day” : 20111017,       “day” : 20111017,
    “checkout” : 100        “checkout” : 42         “checkout” : 215        “checkout” : 73
}                       }                       }                       }
Problem

• We want to
 • Compute the sum of ‘checkout’ over every object in our
    tickets collection

 • Be able to distribute this operation over the network
 • Be fast!
• We don’t want to
 • Go over all objects again when an update is made
Map : emit(checkout)

    The ‘map’ function emits (selects) the checkout value
               of each object in our collection


          100                      42                     215                      73



{                       {                       {                       {
    “id” : 1,               “id” : 2,               “id” : 3,               “id” : 4,
    “day” : 20111017,       “day” : 20111017,       “day” : 20111017,       “day” : 20111017,
    “checkout” : 100        “checkout” : 42         “checkout” : 215        “checkout” : 73
}                       }                       }                       }
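
The map step above can be sketched in plain JavaScript, outside MongoDB. The `emit` function and the `emitted` array here are local stand-ins (assumptions for illustration); in the mongo shell, `emit` is provided by the server.

```javascript
// The four tickets from the slide.
const tickets = [
  { id: 1, day: 20111017, checkout: 100 },
  { id: 2, day: 20111017, checkout: 42 },
  { id: 3, day: 20111017, checkout: 215 },
  { id: 4, day: 20111017, checkout: 73 }
];

// Local stand-in for MongoDB's emit: it just records key/value pairs.
const emitted = [];
function emit(key, value) {
  emitted.push({ key: key, value: value });
}

// 'map' runs once per document, with `this` bound to that document.
function map() {
  emit(null, this.checkout);
}

tickets.forEach((ticket) => map.call(ticket));
console.log(emitted.map((e) => e.value)); // the values 100, 42, 215, 73
```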
Reduce : sum(checkout)
                                                  430




                        142                                                 288




          100                        42                       215                        73



{                         {                         {                        {
    “id” : 1,                 “id” : 2,                 “id” : 3,                 “id” : 4,
    “day” : 20111017,         “day” : 20111017,         “day” : 20111017,         “day” : 20111017,
    “checkout” : 100          “checkout” : 42           “checkout” : 215          “checkout” : 73
}                         }                         }                        }
Reduce function

 The ‘reduce’ function applies the algorithmic logic
 to each key and its values received from the ‘map’ function

This function has to be ‘idempotent’ (and commutative) so it can be called
      recursively or in a distributed system

reduce(k, A, B) == reduce(k, B, A)
reduce(k, A, B) == reduce(k, reduce(k, A, B))
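
The properties above can be checked on a simple sum reducer. This is a minimal sketch in plain JavaScript, assuming the MongoDB-style signature `reduce(key, values)` where `values` is an array; the variables `a`, `b`, `c` are just sample checkout amounts.

```javascript
// Sum reducer with a MongoDB-style signature: (key, values) -> value.
function reduce(key, values) {
  return values.reduce((sum, value) => sum + value, 0);
}

const a = 100;
const b = 42;
const c = 215;

// The order of the values does not matter (commutativity).
const commutative = reduce(null, [a, b]) === reduce(null, [b, a]);

// Reducing an already reduced value changes nothing (idempotence).
const idempotent =
  reduce(null, [reduce(null, [a, b])]) === reduce(null, [a, b]);

// A partial result can be fed back in with other values (associativity).
const associative =
  reduce(null, [c, reduce(null, [a, b])]) === reduce(null, [a, b, c]);

console.log(commutative, idempotent, associative); // true true true
```

A reducer that, say, counted its inputs by returning `values.length` would fail the second and third checks, which is why the reduce output must have the same shape as the emitted values.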
Inherently Distributed
                                                  430




                        142                                                 288




          100                        42                       215                        73



{                         {                         {                        {
    “id” : 1,                 “id” : 2,                 “id” : 3,                 “id” : 4,
    “day” : 20111017,         “day” : 20111017,         “day” : 20111017,         “day” : 20111017,
    “checkout” : 100          “checkout” : 42           “checkout” : 215          “checkout” : 73
}                         }                         }                        }
Distributed
Since the ‘map’ function emits the objects to be reduced
and the ‘reduce’ function processes each emitted
   object independently, the work can be distributed
            across multiple workers.




         map                     reduce
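
A minimal sketch of that distribution, in plain JavaScript: the emitted values are split between two hypothetical workers (`workerA` and `workerB` are illustrative names, simulated here as array slices), each worker reduces its share, and the partial results are reduced once more.

```javascript
// Same sum reducer as in the mongo shell examples.
function reduce(key, values) {
  return values.reduce((sum, value) => sum + value, 0);
}

// Values emitted by 'map' for the four tickets.
const checkouts = [100, 42, 215, 73];

// Split the work between two hypothetical workers.
const workerA = checkouts.slice(0, 2);
const workerB = checkouts.slice(2);

// Each worker reduces its own share independently.
const partialA = reduce(null, workerA); // 142
const partialB = reduce(null, workerB); // 288

// The partial results are reduced again to get the final value.
const total = reduce(null, [partialA, partialB]); // 430
console.log(partialA, partialB, total);
```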
Logarithmic Update

For the same reason, when an object is updated, we
    don’t have to reprocess every object.

   We can call the ‘map’ function on the updated
                     objects only.
Logarithmic Update
                                                  430




                        142                                                 288




          100                        42                       215                        73



{                         {                         {                        {
    “id” : 1,                 “id” : 2,                 “id” : 3,                 “id” : 4,
    “day” : 20111017,         “day” : 20111017,         “day” : 20111017,         “day” : 20111017,
    “checkout” : 100          “checkout” : 42           “checkout” : 210          “checkout” : 73
}                         }                         }                        }
Logarithmic Update
                                                  430




                        142                                                 288




          100                        42                       210                        73



{                         {                         {                        {
    “id” : 1,                 “id” : 2,                 “id” : 3,                 “id” : 4,
    “day” : 20111017,         “day” : 20111017,         “day” : 20111017,         “day” : 20111017,
    “checkout” : 100          “checkout” : 42           “checkout” : 210          “checkout” : 73
}                         }                         }                        }
Logarithmic Update
                                                  430




                        142                                                 283




          100                        42                       210                        73



{                         {                         {                        {
    “id” : 1,                 “id” : 2,                 “id” : 3,                 “id” : 4,
    “day” : 20111017,         “day” : 20111017,         “day” : 20111017,         “day” : 20111017,
    “checkout” : 100          “checkout” : 42           “checkout” : 210          “checkout” : 73
}                         }                         }                        }
Logarithmic Update
                                                  425




                        142                                                 283




          100                        42                       210                        73



{                         {                         {                        {
    “id” : 1,                 “id” : 2,                 “id” : 3,                 “id” : 4,
    “day” : 20111017,         “day” : 20111017,         “day” : 20111017,         “day” : 20111017,
    “checkout” : 100          “checkout” : 42           “checkout” : 210          “checkout” : 73
}                         }                         }                        }
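
The update walk-through above can be sketched as a small reduction tree in plain JavaScript. This is a conceptual model of the diagram only, not a description of how MongoDB stores intermediate results; `buildTree` and `updateLeaf` are hypothetical helper names.

```javascript
// Build a binary reduction tree, level by level; levels[0] holds the leaves.
function buildTree(leaves) {
  const levels = [leaves.slice()];
  while (levels[levels.length - 1].length > 1) {
    const prev = levels[levels.length - 1];
    const next = [];
    for (let i = 0; i < prev.length; i += 2) {
      // A node is the sum of its one or two children.
      next.push(prev[i] + (i + 1 < prev.length ? prev[i + 1] : 0));
    }
    levels.push(next);
  }
  return levels;
}

// Change one leaf and recompute only its ancestors.
// Returns how many inner nodes were recomputed.
function updateLeaf(levels, index, value) {
  levels[0][index] = value;
  let recomputed = 0;
  for (let level = 1; level < levels.length; level++) {
    index = Math.floor(index / 2);
    const below = levels[level - 1];
    const left = below[2 * index];
    const right = 2 * index + 1 < below.length ? below[2 * index + 1] : 0;
    levels[level][index] = left + right;
    recomputed++;
  }
  return recomputed;
}

const tree = buildTree([100, 42, 215, 73]); // root is 430
const touched = updateLeaf(tree, 2, 210);   // 215 becomes 210
console.log(tree[1], tree[2], touched);     // 142 and 283, then 425, then 2
```

With four leaves only two inner nodes are recomputed (288 becomes 283, then 430 becomes 425); the 142 subtree is untouched, which is where the logarithmic cost comes from.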
Let’s do some code!
$> mongo

>   db.tickets.save({   "_id":   1,   "day":   20111017,   "checkout":   100 })
>   db.tickets.save({   "_id":   2,   "day":   20111017,   "checkout":   42 })
>   db.tickets.save({   "_id":   3,   "day":   20111017,   "checkout":   215 })
>   db.tickets.save({   "_id":   4,   "day":   20111017,   "checkout":   73 })

> db.tickets.count()
4

> db.tickets.find()
{ "_id" : 1, "day" : 20111017, "checkout" : 100 }
...

> db.tickets.find({ "_id": 1 })
{ "_id" : 1, "day" : 20111017, "checkout" : 100 }
> var map = function() {
... emit(null, this.checkout)
}

> var reduce = function(key, values) {
... var sum = 0
... for (var index in values) sum += values[index]
... return sum
}
Temporary Collection
> sumOfCheckouts = db.tickets.mapReduce(map, reduce)
{
  "result" : "tmp.mr.mapreduce_123456789_4",
  "timeMillis" : 8,
  "counts" : { "input" : 4, "emit" : 4, "output" : 1 },
  "ok" : 1
}

> db.getCollectionNames()
[
  "tickets",
  "tmp.mr.mapreduce_123456789_4"
]

> db[sumOfCheckouts.result].find()
{ "_id" : null, "value" : 430 }
Persistent Collection
> db.tickets.mapReduce(map, reduce, { "out" : "sumOfCheckouts" })

> db.getCollectionNames()
[
  "sumOfCheckouts",
  "tickets",
  "tmp.mr.mapreduce_123456789_4"
]

> db.sumOfCheckouts.find()
{ "_id" : null, "value" : 430 }

> db.sumOfCheckouts.findOne().value
430
Reduce by Date
> var map = function() {
... emit(this.day, this.checkout)
}

> var reduce = function(key, values) {
... var sum = 0
... for (var index in values) sum += values[index]
... return sum
}
> db.tickets.mapReduce(map, reduce, { "out" : "sumOfCheckouts" })

> db.sumOfCheckouts.find()
{ "_id" : 20111017, "value" : 430 }
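
To see the grouping by key without a running database, here is a minimal in-memory sketch in plain JavaScript. The `emit` function and the `groups` map are local stand-ins for what MongoDB does internally, and the fifth ticket is hypothetical, added only so that there are two distinct days to group.

```javascript
// Tickets from the slides, plus one HYPOTHETICAL ticket on another day.
const tickets = [
  { _id: 1, day: 20111017, checkout: 100 },
  { _id: 2, day: 20111017, checkout: 42 },
  { _id: 3, day: 20111017, checkout: 215 },
  { _id: 4, day: 20111017, checkout: 73 },
  { _id: 5, day: 20111018, checkout: 50 } // hypothetical, not in the talk
];

// Local stand-in for MongoDB's emit: groups emitted values by key.
const groups = new Map();
function emit(key, value) {
  if (!groups.has(key)) groups.set(key, []);
  groups.get(key).push(value);
}

// Same shape as the shell functions: 'map' sees the document as `this`.
const map = function () {
  emit(this.day, this.checkout);
};

const reduce = function (key, values) {
  let sum = 0;
  for (const value of values) sum += value;
  return sum;
};

// Run 'map' on every document, then 'reduce' once per key.
tickets.forEach((ticket) => map.call(ticket));
const sumOfCheckouts = Array.from(groups, ([key, values]) => ({
  _id: key,
  value: reduce(key, values)
}));

console.log(sumOfCheckouts);
```

Each distinct emitted key becomes the `_id` of one output document, so the day 20111017 still sums to 430 and the hypothetical day gets its own entry.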
What we can do
Scored Subjects per User
Subject   User   Score
   1       1       2
   1       1       2
   1       2       2
   2       1       2
   2       2      10
   2       2       5
Scored Subjects per User (reduced)
Subject   User   Score

  1        1      4

  1        2      2

  2        1      2

  2        2      15
$> mongo

>   db.scores.save({   "_id":   1,   "subject_id":   1,   "user_id":   1,   "score":   2 })
>   db.scores.save({   "_id":   2,   "subject_id":   1,   "user_id":   1,   "score":   2 })
>   db.scores.save({   "_id":   3,   "subject_id":   1,   "user_id":   2,   "score":   2 })
>   db.scores.save({   "_id":   4,   "subject_id":   2,   "user_id":   1,   "score":   2 })
>   db.scores.save({   "_id":   5,   "subject_id":   2,   "user_id":   2,   "score":   10 })
>   db.scores.save({   "_id":   6,   "subject_id":   2,   "user_id":   2,   "score":   5 })

> db.scores.count()
6

> db.scores.find()
{ "_id": 1, "subject_id": 1, "user_id": 1, "score": 2 }
...

> db.scores.find({ "_id": 1 })
{ "_id": 1, "subject_id": 1, "user_id": 1, "score": 2 }
> var map = function() {
... emit([this.user_id, this.subject_id].join("-"), {subject_id:this.subject_id,
... user_id:this.user_id, score:this.score});
}

> var reduce = function(key, values) {
... var result = {user_id:"", subject_id:"", score:0};
... values.forEach(function (value) {result.score += value.score;result.user_id =
... value.user_id;result.subject_id = value.subject_id;});
... return result
}
ReducedScores Collection
> db.scores.mapReduce(map, reduce, { "out" : "reduced_scores" })

> db.getCollectionNames()
[
  "reduced_scores",
  "scores"
]

>   db.reduced_scores.find()
{   "_id" : "1-1", "value" :   {   "user_id"   :   1,   "subject_id"   :   1,   "score"   :   4 } }
{   "_id" : "1-2", "value" :   {   "user_id"   :   1,   "subject_id"   :   2,   "score"   :   2 } }
{   "_id" : "2-1", "value" :   {   "user_id"   :   2,   "subject_id"   :   1,   "score"   :   2 } }
{   "_id" : "2-2", "value" :   {   "user_id"   :   2,   "subject_id"   :   2,   "score"   :   15 } }

> db.reduced_scores.findOne().value.score
4
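
The compound-key reduction above can be replayed in plain JavaScript to check the expected totals. This is a sketch under the assumption that grouping by the joined `user_id-subject_id` string behaves as in the shell session; `keyOf` and `groups` are hypothetical helpers standing in for `emit` and MongoDB's internal grouping.

```javascript
// The six score documents saved in the shell session.
const scores = [
  { _id: 1, subject_id: 1, user_id: 1, score: 2 },
  { _id: 2, subject_id: 1, user_id: 1, score: 2 },
  { _id: 3, subject_id: 1, user_id: 2, score: 2 },
  { _id: 4, subject_id: 2, user_id: 1, score: 2 },
  { _id: 5, subject_id: 2, user_id: 2, score: 10 },
  { _id: 6, subject_id: 2, user_id: 2, score: 5 }
];

// Same compound key as the 'map' function: "user-subject".
function keyOf(doc) {
  return [doc.user_id, doc.subject_id].join("-");
}

// Same logic as the 'reduce' function; the result has the same shape
// as the emitted values, so it can safely be reduced again.
function reduce(key, values) {
  const result = { user_id: "", subject_id: "", score: 0 };
  values.forEach((value) => {
    result.score += value.score;
    result.user_id = value.user_id;
    result.subject_id = value.subject_id;
  });
  return result;
}

// Group the emitted values by key (stand-in for MongoDB's grouping).
const groups = {};
for (const doc of scores) {
  const key = keyOf(doc);
  groups[key] = groups[key] || [];
  groups[key].push({
    subject_id: doc.subject_id,
    user_id: doc.user_id,
    score: doc.score
  });
}

const reducedScores = Object.keys(groups)
  .sort()
  .map((key) => ({ _id: key, value: reduce(key, groups[key]) }));

console.log(reducedScores.map((r) => r._id + ": " + r.value.score));
```

Because the reduced value carries `user_id` and `subject_id` alongside `score`, the output documents can be queried on `value.user_id`, as the Rails examples do.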
Dealing with Rails Query

ruby-1.9.2-p180 :007 > ReducedScores.first
 => #<ReducedScores _id: 1-1, _type: nil, value: {"user_id"=>BSON::ObjectId('...'),
"subject_id"=>BSON::ObjectId('...'), "score"=>4.0}>

ruby-1.9.2-p180 :008 > ReducedScores.where("value.user_id" => u1.id).count
 => 2

ruby-1.9.2-p180 :009 > ReducedScores.where("value.user_id" => u1.id).first.value['score']
 => 4.0

ruby-1.9.2-p180 :010 > ReducedScores.where("value.user_id" => u1.id).last.value['score']
 => 2.0
Questions?

Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 

Apéro RubyBdx - MongoDB - 8-11-2011

  • 1. Pierre-Louis Gottfrois, Bastien Murzeau. Apéro Ruby Bordeaux, November 8, 2011
  • 2. Outline
      • Brief introduction
      • Case study
      • Map / Reduce
  • 3. What is mongoDB? mongoDB is a NoSQL database that is schemaless and document-oriented.
  • 4. Schemaless
      • Very useful in 'agile' development (iterations, quick changes, flexibility for developers)
      • Supports features that, in relational databases, would be:
          • nearly impossible (storing open-ended sets of elements, e.g. tags)
          • overly complex for what they are (migrations)
  • 5. Document-oriented
      • mongoDB stores documents, not rows
          • documents are stored as JSON, more precisely binary JSON (BSON)
      • the query syntax is as rich as SQL's
      • the 'embedded' document mechanism solves many commonly encountered problems
  • 6. Document-oriented
      • Documents are stored in a collection, the equivalent of a model in RoR (Ruby on Rails)
      • part of this data is indexed to optimize performance
      • a document is not a dumping ground!
  • 7. Storing large volumes of data
      • mongoDB (and other NoSQL databases) perform better at horizontal scaling
          • adding servers to increase storage capacity ("sharding")
          • thereby ensuring better availability
          • optimized load balancing between nodes
          • growth that is transparent to the application
  • 8. Case study
      • The ORM becomes an ODM; the reference gem is mongoid
          • alternatives: mongoMapper, DataMapper
      • Creating an application backed by MongoDB (NoSQL)
          • rails new nosql
          • edit the Gemfile
              • gem 'mongoid'
              • gem 'bson_ext'
          • bundle install
          • rails generate mongoid:config
  • 9. Case study
      • edit config/application.rb: comment out the catch-all require and load the railties individually
          • # require 'rails/all'
          • require "action_controller/railtie"
          • require "action_mailer/railtie"
          • require "active_resource/railtie"
          • require "rails/test_unit/railtie"
  • 10. Case study
      class Subject
        include Mongoid::Document
        include Mongoid::Timestamps

        has_many :scores, :as => :scorable, :dependent => :delete, :autosave => true
        has_many :requests, :dependent => :delete
        belongs_to :author, :class_name => 'User'
      end

      class Conversation
        include Mongoid::Document
        include Mongoid::Timestamps

        field :public, :type => Boolean, :default => false

        has_many :scores, :as => :scorable, :dependent => :delete
        has_and_belongs_to_many :subjects
        belongs_to :timeline
        embeds_many :messages
      end
  • 12. Example: a "ticket" collection
      { "id" : 1, "day" : 20111017, "checkout" : 100 }
      { "id" : 2, "day" : 20111017, "checkout" : 42 }
      { "id" : 3, "day" : 20111017, "checkout" : 215 }
      { "id" : 4, "day" : 20111017, "checkout" : 73 }
  • 13. Problem statement
      • We want to
          • Compute the sum of the 'checkout' values across all objects in our ticket collection
          • Be able to distribute this operation over the network
          • Be fast!
      • We don't want to
          • Go over all objects again when an update is made
  • 14. Map: emit(checkout). The 'map' function emits (selects) the checkout value of each object in our collection.
      emitted values: 100, 42, 215, 73 (from tickets 1 to 4)
  • 15. Reduce: sum(checkout)
      total:          430
      partial sums:   142 (100 + 42)    288 (215 + 73)
      emitted values: 100, 42, 215, 73
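  • 16. Reduce function. The 'reduce' function applies the algorithmic logic to the values received for each key from the 'map' function. To be called recursively or in a distributed system, it must give the same result whatever the order of the values, and it must accept its own output as input (the slide calls this being 'idempotent'):
      reduce(k, [A, B]) == reduce(k, [B, A])
      reduce(k, [A, B, C]) == reduce(k, [reduce(k, [A, B]), C])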
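The properties on the reduce-function slide can be checked directly in plain JavaScript. This is a minimal sketch: `reduceSum` and the sample values `A`, `B`, `C` are illustrative names, not part of the deck, and nothing here talks to MongoDB.

```javascript
// Sum reducer with the same shape as a mongo shell reduce function: (key, values) -> value.
function reduceSum(key, values) {
  let sum = 0;
  for (const value of values) sum += value;
  return sum;
}

// Illustrative sample values (taken from the ticket checkouts).
const A = 100, B = 42, C = 215;

// The order of the values must not matter.
const commutative = reduceSum("k", [A, B]) === reduceSum("k", [B, A]);

// Reducing a partial result together with the remaining values must give
// the same answer as reducing everything at once.
const associative =
  reduceSum("k", [A, B, C]) === reduceSum("k", [reduceSum("k", [A, B]), C]);

// Re-reducing an already reduced value must leave it unchanged.
const rereducible =
  reduceSum("k", [reduceSum("k", [A, B, C])]) === reduceSum("k", [A, B, C]);

console.log({ commutative, associative, rereducible });
```

A reducer that computed an average directly would fail the second check, which is why averages are usually carried as a (sum, count) pair and divided at the end.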
  • 17. Inherently distributed
      total:          430
      partial sums:   142 (100 + 42)    288 (215 + 73)
      emitted values: 100, 42, 215, 73
  • 18. Distributed. Since the 'map' function emits objects to be reduced and the 'reduce' function processes each group of emitted objects independently, the work can be distributed across multiple workers. [Diagram: map, then reduce]
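As a rough illustration of that distribution, the sketch below splits the four tickets between two hypothetical workers, reduces each chunk on its own, then reduces the partial sums. The split into `workerInputs` is an assumption made for the example; MongoDB decides the actual partitioning itself.

```javascript
const tickets = [
  { id: 1, day: 20111017, checkout: 100 },
  { id: 2, day: 20111017, checkout: 42 },
  { id: 3, day: 20111017, checkout: 215 },
  { id: 4, day: 20111017, checkout: 73 },
];

// Map step: select the value to aggregate.
function mapCheckout(ticket) {
  return ticket.checkout;
}

// Reduce step: sum a list of values.
function reduceSum(values) {
  return values.reduce((sum, value) => sum + value, 0);
}

// Hypothetical split of the collection across two workers.
const workerInputs = [tickets.slice(0, 2), tickets.slice(2)];

// Each worker maps and reduces its own chunk independently.
const partialSums = workerInputs.map((chunk) => reduceSum(chunk.map(mapCheckout)));

// A final reduce combines the partial results.
const total = reduceSum(partialSums);

console.log(partialSums, total); // [ 142, 288 ] 430
```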
  • 19. Logarithmic update. For the same reason, when an object is updated, we don't have to reprocess every object. We can call the 'map' function on the updated objects only.
  • 20. Logarithmic update: ticket 3 changes its checkout from 215 to 210; the tree is still stale
      total:          430
      partial sums:   142    288
      emitted values: 100, 42, 215, 73
  • 21. Logarithmic update: 'map' is re-run on ticket 3 only
      total:          430
      partial sums:   142    288
      emitted values: 100, 42, 210, 73
  • 22. Logarithmic update: the partial sum above ticket 3 is recomputed
      total:          430
      partial sums:   142    283
      emitted values: 100, 42, 210, 73
  • 23. Logarithmic update: the total is recomputed
      total:          425
      partial sums:   142    283
      emitted values: 100, 42, 210, 73
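The update sequence above can be sketched with a small sum tree stored in an array. This is an illustration of the idea only, not a description of MongoDB internals; `buildTree` and `updateLeaf` are hypothetical helpers, and the number of leaves is assumed to be a power of two to keep the code short.

```javascript
// Array layout: node i has children 2i and 2i+1; the leaves sit at n..2n-1.
function buildTree(leaves) {
  const n = leaves.length; // assumed to be a power of two
  const tree = new Array(2 * n).fill(0);
  for (let i = 0; i < n; i++) tree[n + i] = leaves[i];
  for (let i = n - 1; i >= 1; i--) tree[i] = tree[2 * i] + tree[2 * i + 1];
  return tree;
}

// Replace one leaf and recompute only its ancestors.
// Returns how many internal nodes were recomputed.
function updateLeaf(tree, index, value) {
  const n = tree.length / 2;
  let node = n + index;
  tree[node] = value;
  let recomputed = 0;
  while (node > 1) {
    node = Math.floor(node / 2);
    tree[node] = tree[2 * node] + tree[2 * node + 1];
    recomputed++;
  }
  return recomputed;
}

const tree = buildTree([100, 42, 215, 73]);
const totalBefore = tree[1]; // 430

// Ticket 3 (leaf index 2) changes from 215 to 210.
const recomputed = updateLeaf(tree, 2, 210);
const totalAfter = tree[1]; // 425

console.log({ totalBefore, totalAfter, recomputed }); // recomputed is 2 for 4 leaves
```

Only the nodes on the path from the changed leaf to the root are touched, so the cost grows with the logarithm of the number of leaves rather than with their count.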
  • 25. Populating the collection from the mongo shell
      $> mongo
      > db.tickets.save({ "_id": 1, "day": 20111017, "checkout": 100 })
      > db.tickets.save({ "_id": 2, "day": 20111017, "checkout": 42 })
      > db.tickets.save({ "_id": 3, "day": 20111017, "checkout": 215 })
      > db.tickets.save({ "_id": 4, "day": 20111017, "checkout": 73 })
      > db.tickets.count()
      4
      > db.tickets.find()
      { "_id" : 1, "day" : 20111017, "checkout" : 100 }
      ...
      > db.tickets.find({ "_id": 1 })
      { "_id" : 1, "day" : 20111017, "checkout" : 100 }
  • 26. Map and reduce functions (a single key, null, so all checkouts end up in one sum)
      > var map = function() {
      ...   emit(null, this.checkout)
      ... }
      > var reduce = function(key, values) {
      ...   var sum = 0
      ...   for (var index in values) sum += values[index]
      ...   return sum
      ... }
  • 27. Temporary collection
      > sumOfCheckouts = db.tickets.mapReduce(map, reduce)
      {
        "result" : "tmp.mr.mapreduce_123456789_4",
        "timeMillis" : 8,
        "counts" : { "input" : 4, "emit" : 4, "output" : 1 },
        "ok" : 1
      }
      > db.getCollectionNames()
      [ "tickets", "tmp.mr.mapreduce_123456789_4" ]
      > db[sumOfCheckouts.result].find()
      { "_id" : null, "value" : 430 }
  • 28. Persistent collection
      > db.tickets.mapReduce(map, reduce, { "out" : "sumOfCheckouts" })
      > db.getCollectionNames()
      [ "sumOfCheckouts", "tickets", "tmp.mr.mapreduce_123456789_4" ]
      > db.sumOfCheckouts.find()
      { "_id" : null, "value" : 430 }
      > db.sumOfCheckouts.findOne().value
      430
  • 30. Map and reduce functions, grouped by day (the key is the document's "day" field)
      > var map = function() {
      ...   emit(this.day, this.checkout)
      ... }
      > var reduce = function(key, values) {
      ...   var sum = 0
      ...   for (var index in values) sum += values[index]
      ...   return sum
      ... }
  • 31. One result per day
      > db.tickets.mapReduce(map, reduce, { "out" : "sumOfCheckouts" })
      > db.sumOfCheckouts.find()
      { "_id" : 20111017, "value" : 430 }
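To see what grouping by key does without a running server, here is a small in-memory stand-in for `mapReduce`. It is a sketch under stated assumptions: the `mapReduce` helper below is hypothetical, `emit` is passed to `map` as an argument (in the mongo shell it is a global), and the fifth ticket is invented so that a second day appears in the output.

```javascript
// In-memory stand-in for mapReduce; not the MongoDB implementation.
function mapReduce(docs, map, reduce) {
  const groups = new Map(); // key -> list of emitted values
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  // `this` is the current document, as in the mongo shell.
  for (const doc of docs) map.call(doc, emit);
  // MongoDB does not call reduce for a key that has a single value.
  return [...groups].map(([key, values]) => ({
    _id: key,
    value: values.length === 1 ? values[0] : reduce(key, values),
  }));
}

const tickets = [
  { _id: 1, day: 20111017, checkout: 100 },
  { _id: 2, day: 20111017, checkout: 42 },
  { _id: 3, day: 20111017, checkout: 215 },
  { _id: 4, day: 20111017, checkout: 73 },
  { _id: 5, day: 20111018, checkout: 58 }, // hypothetical extra ticket, not in the slides
];

const map = function (emit) {
  emit(this.day, this.checkout);
};

const reduce = function (key, values) {
  let sum = 0;
  for (const value of values) sum += value;
  return sum;
};

const sumOfCheckouts = mapReduce(tickets, map, reduce);
console.log(sumOfCheckouts);
```

Each distinct key emitted by `map` becomes the `_id` of one output document, which is why the slides' result has `_id` equal to the day.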
  • 33. Scored Subjects per User
      Subject  User  Score
      1        1     2
      1        1     2
      1        2     2
      2        1     2
      2        2     10
      2        2     5
  • 34. Scored Subjects per User (reduced)
      Subject  User  Score
      1        1     4
      1        2     2
      2        1     2
      2        2     15
  • 35. Populating the scores collection
      $> mongo
      > db.scores.save({ "_id": 1, "subject_id": 1, "user_id": 1, "score": 2 })
      > db.scores.save({ "_id": 2, "subject_id": 1, "user_id": 1, "score": 2 })
      > db.scores.save({ "_id": 3, "subject_id": 1, "user_id": 2, "score": 2 })
      > db.scores.save({ "_id": 4, "subject_id": 2, "user_id": 1, "score": 2 })
      > db.scores.save({ "_id": 5, "subject_id": 2, "user_id": 2, "score": 10 })
      > db.scores.save({ "_id": 6, "subject_id": 2, "user_id": 2, "score": 5 })
      > db.scores.count()
      6
      > db.scores.find()
      { "_id": 1, "subject_id": 1, "user_id": 1, "score": 2 }
      ...
      > db.scores.find({ "_id": 1 })
      { "_id": 1, "subject_id": 1, "user_id": 1, "score": 2 }
  • 36. Map and reduce functions with a composite key ("user_id-subject_id")
      > var map = function() {
      ...   emit([this.user_id, this.subject_id].join("-"),
      ...        { subject_id: this.subject_id, user_id: this.user_id, score: this.score });
      ... }
      > var reduce = function(key, values) {
      ...   var result = { user_id: "", subject_id: "", score: 0 };
      ...   values.forEach(function (value) {
      ...     result.score += value.score;
      ...     result.user_id = value.user_id;
      ...     result.subject_id = value.subject_id;
      ...   });
      ...   return result
      ... }
  • 37. ReducedScores collection
      > db.scores.mapReduce(map, reduce, { "out" : "reduced_scores" })
      > db.getCollectionNames()
      [ "reduced_scores", "scores" ]
      > db.reduced_scores.find()
      { "_id" : "1-1", "value" : { "user_id" : 1, "subject_id" : 1, "score" : 4 } }
      { "_id" : "1-2", "value" : { "user_id" : 1, "subject_id" : 2, "score" : 2 } }
      { "_id" : "2-1", "value" : { "user_id" : 2, "subject_id" : 1, "score" : 2 } }
      { "_id" : "2-2", "value" : { "user_id" : 2, "subject_id" : 2, "score" : 15 } }
      > db.reduced_scores.findOne().value.score
      4
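One detail worth noting about the scores reducer: it returns an object with the same fields as the values emitted by `map`. That matters because MongoDB may feed a reduce result back into `reduce` together with further values. The sketch below replays this in plain JavaScript for the key "2-2"; the split into a partial and a final reduce is an assumption made for illustration.

```javascript
// Same reducer as on the slide, written as a standalone function.
const reduce = function (key, values) {
  const result = { user_id: "", subject_id: "", score: 0 };
  values.forEach(function (value) {
    result.score += value.score;
    result.user_id = value.user_id;
    result.subject_id = value.subject_id;
  });
  return result;
};

// The two values emitted for user 2 / subject 2 in the scores collection.
const emitted = [
  { subject_id: 2, user_id: 2, score: 10 },
  { subject_id: 2, user_id: 2, score: 5 },
];

// Reduce everything at once.
const direct = reduce("2-2", emitted);

// Reduce in two passes: a partial result first, then the partial result
// together with the remaining value (a "re-reduce").
const partial = reduce("2-2", [emitted[0]]);
const rereduced = reduce("2-2", [partial, emitted[1]]);

console.log(direct, rereduced);
```

Both paths give `{ user_id: 2, subject_id: 2, score: 15 }`. If the reducer returned a bare number instead, the second pass would read `value.score` on a number and the sum would break.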
  • 38. Querying from Rails
      ruby-1.9.2-p180 :007 > ReducedScores.first
       => #<ReducedScores _id: 1-1, _type: nil, value: {"user_id"=>BSON::ObjectId('...'), "subject_id"=>BSON::ObjectId('...'), "score"=>4.0}>
      ruby-1.9.2-p180 :008 > ReducedScores.where("value.user_id" => u1.id).count
       => 2
      ruby-1.9.2-p180 :009 > ReducedScores.where("value.user_id" => u1.id).first.value['score']
       => 4.0
      ruby-1.9.2-p180 :010 > ReducedScores.where("value.user_id" => u1.id).last.value['score']
       => 2.0