"Big Data" is definitely one of the buzzwords of the IT community, yet, as often happens, its actual definition is somewhat vague: is "Big Data" a question of data volume alone, or also a question of performance? The answer, in my opinion, can be found in the definition provided by Gartner: "Big data is high-volume, high-velocity and/or high-variety information assets that demand cost-effective, innovative forms of information processing that enable enhanced insight, decision making, and process automation".
Does this mean that doing "Big Data" requires a new DB technology? The short answer is yes: traditional DB technologies fit poorly with some of the "Big Data" requirements. This new challenge has led to a proliferation of new DB technologies and vendors, each with a different approach and solution. A good review of the available options is provided by the following site: Database technology comparison-matrix.
MongoDB is one of the possible technology choices. In my opinion, its document-oriented approach makes data modeling much more linear, and it also eases the not always easy communication between IT and non-IT members of a company.
The next question is: "Can microservices play a role in a Big Data scenario?". In my opinion, microservices can play an important role in developing data handling in a "Big Data" landscape. Expanding on the concept of microservice orchestration, one can imagine a scenario where a pool of microservices is used for data orchestration, with new "aggregated data" created by combining the individual functions exposed by the microservices. This would certainly cover two of the characteristics in Gartner's definition:
- high-variety information: orchestrating the original data into new aggregated data avoids creating new database-bound tables or collections
- cost-effective: microservices are effective to the extent that we are able to decompose the problem at hand correctly
To bring Jolie into the Big Data discourse, a MongoDB connector has been developed. The choice of MongoDB as the first Big Data technology approached by the Jolie team was not accidental: it was driven by the tree-like data representation used by MongoDB, which is extremely similar, if not identical, to the way Jolie represents complex data in its complex type definitions.
The connector has been developed outside the normal Jolie language development, but it is developed by members of the Jolie team, and it responds to specific demands from a growing Jolie programming community.
The development team adopted a top-down approach, starting from an analysis of a set of requirements, passing through the interface definition, and arriving at some pilot projects (whose results will be presented later on).
The identified requirements were the following:
- Preserve the NoSQL nature of MongoDB
- Give the connector user a simple and intuitive interface to the MongoDB world
- Preserve some of the native aggregation and correlation characteristics of MongoDB
From these three requirements the following considerations were derived:
- The connector will have to provide all the CRUD operations
- The connector will use the native query representation (JSON)
- The connector will provide access to MongoDB's aggregation capabilities
- The naming convention of the service operation types replicates the terms used by MongoDB
These considerations resulted in the following interface:
interface MongoDBInterface {
  RequestResponse:
    connect (ConnectRequest)(ConnectResponse) throws MongoException,
    query (QueryRequest)(QueryResponse) throws MongoException JsonParseException,
    insert (InsertRequest)(InsertResponse) throws MongoException JsonParseException,
    update (UpdateRequest)(UpdateResponse) throws MongoException JsonParseException,
    delete (DeleteRequest)(DeleteResponse) throws MongoException JsonParseException,
    aggregate (AggregateRequest)(AggregateResponse) throws MongoException JsonParseException
}
Let us start with the insert operation:
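// Target collection for the new document
q.collection = "CustomerSales";
with (q.document){
  .name = "Lars";
  .surname = "Larsesen";
  .code = "LALA01";
  .age = 28;
  // Nested sub-document describing a single purchase
  with (.purchase){
    .ammount = 30.12;
    // The "@type" child node marks the value as a MongoDB Date
    .date.("@type")="Date";
    .date= currentTime;
    .location.street= "Mongo road";
    .location.number= 2
  }
};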
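As can be seen, the document can be inserted into the "CustomerSales" collection simply by defining the document to be inserted as a Jolie structured variable.
The query operation, like any other operation that requires filtering a subset of documents (update, delete, aggregate), takes a filter written in MongoDB's native JSON query syntax. Values are injected through '$name' placeholders, each of which is set as a child node of the filter: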
q.filter = "{'purchase.date':{$lt:'$date'}}";
q.filter.date =long("1463572271651");
q.filter.date.("@type")="Date";
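The long value in the filter above looks like a count of milliseconds since the Unix epoch, which is also how MongoDB stores its Date type. Assuming that reading is correct, the following Python sketch shows the conversion in both directions, always in UTC. The helper names are hypothetical and are not part of the connector.

```python
from datetime import datetime, timedelta, timezone

# Reference instant: the Unix epoch, expressed in UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def from_epoch_millis(millis):
    """Convert milliseconds since the Unix epoch to a UTC datetime."""
    return EPOCH + timedelta(milliseconds=millis)


def to_epoch_millis(moment):
    """Convert a timezone-aware datetime to milliseconds since the epoch."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    # Integer division of timedeltas avoids floating-point rounding.
    return (moment - EPOCH) // timedelta(milliseconds=1)


purchase_limit = from_epoch_millis(1463572271651)
print(purchase_limit.isoformat())  # 2016-05-18T11:51:11.651000+00:00
print(to_epoch_millis(purchase_limit))  # 1463572271651
```

Working with timezone-aware values throughout makes it easier to spot accidental local-time conversions when dates are written to or read back from the database.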
Using the same representation, values can be injected for an update operation:
q.collection = "CustomerSales";
q.filter = "{surname: '$surname'}";
q.filter.surname = "Larsesen";
q.documentUpdate = "{$set:{age:'$age'}}";
q.documentUpdate.age= 22;
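To make the placeholder mechanism more concrete, here is a minimal Python sketch of how quoted '$name' placeholders in a template could be resolved from a set of named values. It only illustrates the idea and is not the connector's actual implementation; the function name `bind_placeholders` and the rule of leaving unknown names untouched are assumptions.

```python
import json
import re

# Matches a single-quoted placeholder such as '$surname'.
PLACEHOLDER = re.compile(r"'\$(\w+)'")


def bind_placeholders(template, values):
    """Replace each quoted '$name' with the JSON encoding of values[name].

    Names without a supplied value are left untouched (an assumption made
    for this sketch).
    """
    def substitute(match):
        name = match.group(1)
        if name not in values:
            return match.group(0)
        return json.dumps(values[name])

    return PLACEHOLDER.sub(substitute, template)


filter_text = bind_placeholders("{surname: '$surname'}", {"surname": "Larsesen"})
update_text = bind_placeholders("{$set:{age:'$age'}}", {"age": 22})
print(filter_text)  # {surname: "Larsesen"}
print(update_text)  # {$set:{age:22}}
```

Encoding each value with `json.dumps` keeps strings quoted and numbers bare, so the resulting text preserves the type of the injected value.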
The same approach applies to aggregation:
q.collection = "CustomerSales";
q.filter = "{$group:{ _id : '$surname', total:{$sum : 1}}}";
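For readers less familiar with MongoDB's aggregation syntax, the `$group` stage above produces one result per distinct surname, with `$sum : 1` counting the documents in each group. The plain Python sketch below reproduces that computation on a small in-memory list. The sample documents and the helper name `group_count` are made up for illustration.

```python
from collections import Counter


def group_count(documents, field):
    """Count documents per distinct value of `field`, like $group with $sum: 1."""
    # Documents lacking the field are grouped under None.
    counts = Counter(doc.get(field) for doc in documents)
    return [{"_id": key, "total": total} for key, total in counts.items()]


# Hypothetical sample data standing in for the "CustomerSales" collection.
sample = [
    {"surname": "Larsesen", "age": 28},
    {"surname": "Larsesen", "age": 31},
    {"surname": "Hansen", "age": 40},
]

result = group_count(sample, "surname")
print(result)
# [{'_id': 'Larsesen', 'total': 2}, {'_id': 'Hansen', 'total': 1}]
```

The sketch returns groups in order of first appearance, whereas MongoDB does not guarantee any order for `$group` output unless a sort stage follows.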
Use and limitations of the connector
The connector can be downloaded here (Jolie Custom services Installation). The current release has to be considered a beta release and is therefore prone to imperfections and bugs.
Some of them are already known and are in the process of being corrected:
- Logoff operation missing
- Handling of MongoDB security
- DateTime storing error [UTC DateTime]
Type mapping between MongoDB and Jolie:

Mongo Type | Jolie Type | Detail |
---|---|---|
Double | double | |
int32 | int | |
int64 | int | |
String | string | |
DateTime | long | with child node ("@type")="Date" |
ObjectId | long | |