
Friday, 5 August 2016

MongoDB Jolie Connector Part 2: Application Example

In the previous post I looked at the basic use of Jolie's MongoDB connector, with an overview of its characteristics and basic operations. In this post we look at a real use of the connector. The obvious choice would have been the implementation of a Sales and Distribution application; yet in the company where I work such a need has not arisen. The opportunity to use MongoDB and Jolie arose when the development of a Plant Management software was approved. Production plants are often very complex structures with hundreds of components. The best way to represent such complex structures is a tree, where the hierarchical nature of a plant and its sub-components can be best expressed. The tree data structure also fits perfectly with Jolie data structures and with MongoDB's document approach.
Therefore the first step in the development process was to define a "document", or better a tree data structure, that represents my company's production plants.
     
 type ProductionSiteType: void {
   .productionSite: void {
     .productionSiteName: string
     .productionSiteErpCode: string
     .mainLine*: void {
       .mainLineName: string
       .mainLineErpCode: string
       .subLine*: void {
         .subLineName: string
         .subLineErpCode: string
         .equip*: void {
           .equipName: string
           .equipErpCode: string
         }
       }
     }
   }
 }
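As a concrete illustration, a minimal document conforming to this type can be built as a Jolie structured variable (the names and codes below are invented sample values):

```
// sample plant document; all values are invented for illustration
with ( plant.productionSite ){
  .productionSiteName = "North Plant";
  .productionSiteErpCode = "NP01";
  with ( .mainLine[0] ){
    .mainLineName = "Bottling line";
    .mainLineErpCode = "NP01-ML01";
    with ( .subLine[0] ){
      .subLineName = "Filling";
      .subLineErpCode = "NP01-SL01";
      with ( .equip[0] ){
        .equipName = "Filler unit";
        .equipErpCode = "NP01-EQ01"
      }
    }
  }
}
```

A variable shaped this way can be passed directly wherever ProductionSiteType is expected.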

The second issue was to evaluate how flexible this data structure is: can it adapt to changes required by the business, with the insertion of new information? Let us suppose that the business requires specifying a mainLineManager (under the mainLine node) with his/her contact details, and also requires a supplierMaterialCode for each piece of equipment. Whether the information is compulsory can now be handled using the Jolie node's cardinality.

       .mainLineManager: void {
         .managerName: string
         .email: string
         .phone: string
       }

Since this node is mandatory (no ? cardinality), its insertion will require the data already within the collection to be processed again.

        .equip*: void {
          .supplierMaterialCode?: string
          .equipName: string
          .equipErpCode: string
        }

In this case, since supplierMaterialCode is optional (? cardinality), there is no need for the collection to be processed again.
So far the reader may think that the writer has only shown how to represent a possible plant structure using Jolie.
The advantage of the document approach becomes evident when looking at the interface and its operations:

interface MaintanaceServiceInterface{
   RequestResponse:
    addProductionSite( ProductionSiteType )(AddProductionSiteResponse)
 }

It can be seen that any change in the document type is propagated to the addProductionSite operation, and this is also evident looking at the implementation:

   [addProductionSite(request)(response){
     scope (addProductionSiteScope){
       install (default =>
         valueToPrettyString@StringUtils(addProductionSiteScope)(s);
         println@Console(s)()
       );
       q.collection = "SiteStructure";
       q.document << request;
       insert@MongoDB(q)(responseq)
     }
   }]{nullProcess}

The same can be said for reading plant structures:


type GetProductionSiteRequest: void{
     .productionSiteErpCode?: string
}
type GetProductionSiteResponse: void{
  .document*: ProductionSiteType
}
  
interface MaintanaceServiceInterface{
   RequestResponse:
    addProductionSite( ProductionSiteType )(AddProductionSiteResponse),
    getProductionSite(GetProductionSiteRequest)(GetProductionSiteResponse)
 }


With the following implementation


  [getProductionSite(request)(response){
    scope (getProductionSiteScope){
      install (default =>
        valueToPrettyString@StringUtils(getProductionSiteScope)(s);
        println@Console(s)()
      );
      q.collection = "SiteStructure";
      if ( is_defined( request.productionSiteErpCode ) ){
        q.filter << request;
        q.filter = "{'productionSite.productionSiteErpCode': '$productionSiteErpCode'}"
      };
      if (DEBUG_MODE){
        valueToPrettyString@StringUtils(q)(s);
        println@Console(s)()
      };
      query@MongoDB(q)(responseq);
      if (DEBUG_MODE){
        valueToPrettyString@StringUtils(responseq)(s);
        println@Console(s)()
      };
      response << responseq
    }
  }]{nullProcess}


Once again it can be noticed that the filter is expressed in MongoDB's standard query language, with the data injected directly from the Jolie type.
It has also been necessary to refactor some of the sub-nodes, such as equip, by defining a non-inline type:



       type equipType: void {
         .equipName: string
         .equipErpCode: string
       }

 type AddEquipmentRequest: void {
     .productionSiteErpCode: string
     .mainLineErpCode: string
     .subLineErpCode: string
     .equip: equipType
  }
  interface MaintanaceServiceInterface{
      RequestResponse:
                 addProductionSite( ProductionSiteType )(AddProductionSiteResponse),
                 getProductionSite(GetProductionSiteRequest)(GetProductionSiteResponse),
                 addEquipment( AddEquipmentRequest )(AddProductionSiteResponse)
    } 


By defining a new Jolie complex type it has been possible to centralize the definition of the equipment, which can be used both as a type for further operations and directly as a new document for a new collection.
Another example of the usefulness of the type/document driven design is the access to sub-node content as individual nodes:


type GetMainLineRequest: void{
     .productionSiteErpCode?: string
}
type GetMainLineResponse: void{
   .document*: ProductionSiteType
}
  interface MaintanaceServiceInterface{
      RequestResponse:
                 addProductionSite( ProductionSiteType )(AddProductionSiteResponse),
                 getProductionSite(GetProductionSiteRequest)(GetProductionSiteResponse),
                 addEquipment( AddEquipmentRequest )(AddProductionSiteResponse),
                 getMainLine( GetMainLineRequest )(GetMainLineResponse)
    }


 [getMainLine(request)(response){
   scope (getMainLineScope){
     install (default =>
       valueToPrettyString@StringUtils(getMainLineScope)(s);
       println@Console(s)()
     );
     q.collection = "SiteStructure";
     q.filter[0] = "{ $match : { 'productionSite.productionSiteErpCode' : '$productionSiteErpCode' } }";
     q.filter[0].productionSiteErpCode = request.productionSiteErpCode;
     q.filter[1] = "{ $unwind : '$productionSite' }";
     q.filter[2] = "{ $unwind : '$productionSite.mainLine' }";
     if (DEBUG_MODE){
       valueToPrettyString@StringUtils(q)(s);
       println@Console(s)()
     };
     aggregate@MongoDB(q)(responseq);
     if (DEBUG_MODE){
       valueToPrettyString@StringUtils(responseq)(s);
       println@Console(s)()
     };
     response << responseq
   }
 }]{nullProcess}


By using the aggregate operation it has been possible to extract the content of the sub-node mainLine for a specific plant, producing one document for each mainLine.

Within the limits of a blog post, I hope the reader will appreciate how Jolie and Jolie's MongoDB connector are a valid technology for approaching microservice-based, data-intensive applications.

Monday, 4 July 2016

MongoDB Jolie Connector Part 1

One of the buzzwords of the IT community is definitely "Big Data"; yet, as often happens, the actual definition is somewhat vague: is "Big Data" a question of volume of data, or also a question of performance? The answer can be found, in my opinion, in the definition provided by Gartner: "Big Data is high-volume, high-velocity and/or high-variety information assets that demand cost-effective, innovative forms of information processing that enable enhanced insight, decision making, and process automation".
Does it mean that to do "Big Data" we need a new DB technology? The short answer is yes, we do: traditional DB technologies fit poorly with some of the "Big Data" requirements. This new challenge has created a proliferation of new DB technologies and vendors with different approaches and solutions. A good review of the several options is provided by the following site: Database technology comparison-matrix
MongoDB is one of the possible technology choices; it is my opinion that its document approach makes data modeling much more linear, easing also the not always easy communication between IT and non-IT members of a company.
The next question is: "Can microservices play a role in a Big Data scenario?" It is my opinion that microservices can play an important role in developing data handling in a "Big Data" landscape. Expanding on the microservices orchestration concept, one can imagine a scenario where a pool of microservices is used for data orchestration, where new "aggregated data" are created by using the individual functions exposed by the microservices. This will certainly cover two characteristics of Gartner's definition:
  1. high-variety information: the orchestration of original data into new aggregated data will avoid the creation of new database-bounded tables or collections
  2. cost-effective: microservices are cost-effective to the extent that we are able to correctly decompose the problem at hand
For the purpose of bringing Jolie within the Big Data discourse, a MongoDB connector has been developed. The choice of MongoDB as the first Big Data technology approached by the Jolie team was not accidental, but driven by the tree-like data representation used by MongoDB, which is extremely similar, if not identical, to the way Jolie represents complex data in its complex type definitions.
The connector has been developed outside the normal Jolie language development, but it is developed by Jolie team members and it also responds to specific demands of a growing Jolie programming community.
The developing team adopted a top-down approach when developing the connector, starting from an analysis of a set of requirements, passing through the interface definition, to arrive at some pilot projects (whose results will be presented further on).
The identified requirements were the following:
  1. Preserve the NoSQL nature of MongoDB
  2. Give the connector user a simple and intuitive interface to the MongoDB world
  3. Preserve some of the native aggregation and correlation characteristics of MongoDB
From these three requirements the following considerations followed:
  1. The connector will have to offer all the CRUD operations
  2. The connector will use the native query representation (JSON)
  3. The connector will provide access to MongoDB's aggregation capability
  4. The naming convention of the service operation types will replicate the terms used by MongoDB,
and resulted in the following interface:


interface MongoDBInterface {
  RequestResponse:
  connect (ConnectRequest)(ConnectResponse) throws MongoException ,
  query   (QueryRequest)(QueryResponse)   throws MongoException JsonParseException ,
  insert  (InsertRequest)(InsertResponse)   throws MongoException JsonParseException ,
  update  (UpdateRequest)(UpdateResponse)   throws MongoException JsonParseException ,
  delete  (DeleteRequest)(DeleteResponse)   throws MongoException JsonParseException ,
  aggregate (AggregateRequest)(AggregateResponse)   throws MongoException JsonParseException
}
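The post does not show the shape of ConnectRequest; as a rough sketch, a connection might be established along these lines (the field names host, port and dbname are assumptions, not taken from the connector's actual type definition):

```
// hypothetical connection setup; field names are assumptions
connReq.host = "localhost";
connReq.port = 27017;
connReq.dbname = "CustomerDB";
connect@MongoDB( connReq )( connResp )
```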



As mentioned before, the connector aims to propose a simple-to-use interface.
Starting from the insert operation:

q.collection = "CustomerSales";
with (q.document){
  .name    = "Lars";
  .surname = "Larsesen";
  .code    = "LALA01";
  .age     = 28;
  with (.purchase){
    .amount = 30.12;
    .date.("@type") = "Date";
    .date = currentTime;  // currentTime assumed obtained earlier, e.g. via getCurrentTimeMillis@Time
    .location.street = "Mongo road";
    .location.number = 2
  }
};
As can be seen, the document can be inserted into the "CustomerSales" collection by simply defining the document to be inserted as a Jolie structured variable.
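Once q is built, the insertion itself is a single invocation of the insert operation declared in MongoDBInterface; a minimal sketch, handling the fault the interface declares:

```
scope ( insertScope ){
  // MongoException is declared by the insert operation in MongoDBInterface
  install ( MongoException =>
    println@Console( "insert failed" )()
  );
  insert@MongoDB( q )( insertResponse )
}
```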

The query operation, or any other operation that requires subset filtering (update, delete, aggregate), uses a filter:

q.filter = "{'purchase.date':{$lt:'$date'}}";
q.filter.date = long("1463572271651");
q.filter.date.("@type") = "Date";

Notice the use of '$nameOfVariable' to dynamically inject the value into the search filter.
Using the same representation, values can be injected for an update operation:

q.collection = "CustomerSales";
q.filter = "{surname: '$surname'}";
q.filter.surname = "Larsesen";
q.documentUpdate = "{$set:{age:'$age'}}";
q.documentUpdate.age= 22; 

and so it goes for aggregation:

q.collection = "CustomerSales";
q.filter = "{$group:{ _id : '$surname', total:{$sum : 1}}}";
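As with the other operations, the pipeline is executed by invoking aggregate; multi-stage pipelines can be expressed as consecutive q.filter array elements, with values injected per stage. A sketch reusing the sample data above:

```
q.collection = "CustomerSales";
// stage 0: match one customer, injecting the value via '$surname'
q.filter[0] = "{ $match : { surname : '$surname' } }";
q.filter[0].surname = "Larsesen";
// stage 1: count the matched documents per surname
q.filter[1] = "{ $group : { _id : '$surname', total : { $sum : 1 } } }";
aggregate@MongoDB( q )( aggregateResponse )
```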

Use and limitations of the connector

The connector can be downloaded here ( Jolie Custom services Installation ); the current release has to be considered a Beta release, therefore prone to imperfections and bugs.
Some of them are already known and are on the way to being corrected:
  1. Logoff operation missing
  2. Handling of MongoDB security
  3. DateTime storing error (UTC DateTime)
The user must also consider the following native data type mapping:

Mongo Type   Jolie Type   Detail
Double       double
Int32        int
Int64        int
String       string
DateTime     long         with child node ("@type")="Date"
ObjectId     long