
Let's say I have an array of JSON objects.

{ firstName: "John", Children: ["Maria", "Alfred"], married: false }
{ firstName: "George", Children: ["**zoekerbergen alfonso the second**", "Harvey"], married: false }
{ firstName: "Hary", Children: ["Sam", "Obama"], married: false }

The usual pattern is that the children are an array of short, one-word names.

"zoekerbergen alfonso the second", however, is an anomaly.

Is there some way to learn the pattern of an object and then detect anomalies such as this one, or others such as a person having 1000 children?

I am looking for basic pattern learning and detection of various anomalies.

Thank you.

Aflred

2 Answers


You could create a JSON Schema that describes your JSON data (similar to XSD for XML). It lets you define regex patterns for string fields, limits on array sizes, and much more.

JSON Schema is a vocabulary that allows you to validate, annotate, and manipulate JSON documents.

See https://github.com/json-schema-org/json-schema-spec
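As a rough sketch of what such a schema might look like for the objects in the question, here it is written as a JavaScript object. The name `personSchema`, the `maxItems` limit of 10, and the one-word name pattern are assumptions chosen for illustration, not values taken from the question.

```javascript
// Hypothetical schema for one person object; all limits are illustrative.
var personSchema = {
    type: 'object',
    required: ['firstName', 'Children', 'married'],
    properties: {
        firstName: { type: 'string' },
        Children: {
            type: 'array',
            // assumed upper bound on the number of children
            maxItems: 10,
            // each child must be a single capitalized word
            items: { type: 'string', pattern: '^[A-Z][a-z]+$' }
        },
        married: { type: 'boolean' }
    }
};
```

A schema like this does nothing by itself; it would have to be passed to a JSON Schema validator library, which then reports every object that violates one of the constraints.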

vadzim dvorak
  • Yes, I thought about that, but people can make errors when writing a JSON Schema, so I want some ideas on how to apply anomaly detection to an array of JSON objects, since anomalies can indicate a problem. – Aflred Dec 15 '17 at 09:24
  • Well, I believe that it is the best solution. At https://jsonschema.net/ you can generate a schema for existing JSON and then make the changes you need. – vadzim dvorak Dec 15 '17 at 09:48

Your rules might be too complex for a JSON Schema solution, but you can easily create your own set of rules, as below:

var list = [
    {firstName: 'John', Children: ['Maria', 'Alfred'], married: false},
    {firstName: 'George', Children: ['**zoekerbergen alfonso the second**', 'Harvey'], married: false},
    {firstName: 'Hary', Children: ['Sam', 'Obama', 'Peter'], married: false}
];
// replace with your own maximum
var maxChildren = 2;

// add more rules if needed
var anomaliesRules = [
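    function(row) {
        // too many children (skips rows where Children is missing or not an array)
        return Array.isArray(row.Children) && row.Children.length > maxChildren;
    },
    function(row) {
        // any child that is not a single capitalized word; define your own per-item rules here
        return Array.isArray(row.Children) && row.Children.some(function(child) {
            return typeof child !== 'string' || !/^[A-Z][a-z]+$/.test(child);
        });
    }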
];

var checkAllAnomalies = function(item) {
    for (var i = 0; i < anomaliesRules.length; i++) {
        if (anomaliesRules[i](item)) return true;
    }
    return false;
};

var anomalies = list.filter(checkAllAnomalies);
console.log(anomalies);

At the end you get all the items that match at least one of your defined anomaly rules.
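If you would rather not hard-code the limits, one possible approach is to derive them from the data itself. The sketch below is only an illustration under assumptions: it uses the median plus a multiple of the median absolute deviation (MAD) as an upper bound, and the helper names (`median`, `learnUpperBound`), the factor 3, and the fallback spread of 1 are all made up for this example.

```javascript
// Median of a numeric array (does not modify the input).
function median(values) {
    var sorted = values.slice().sort(function(a, b) { return a - b; });
    var mid = Math.floor(sorted.length / 2);
    return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Assumed rule: upper bound = median + k * MAD.
function learnUpperBound(values, k) {
    var med = median(values);
    var mad = median(values.map(function(v) { return Math.abs(v - med); }));
    // use a spread of at least 1 so identical values do not give a zero-width band
    return med + k * Math.max(mad, 1);
}

var sample = [
    {firstName: 'John', Children: ['Maria', 'Alfred'], married: false},
    {firstName: 'George', Children: ['zoekerbergen alfonso the second', 'Harvey'], married: false},
    {firstName: 'Hary', Children: ['Sam', 'Obama'], married: false}
];

// collect the two features: children per person and length of each child name
var childCounts = [];
var nameLengths = [];
sample.forEach(function(row) {
    childCounts.push(row.Children.length);
    row.Children.forEach(function(child) { nameLengths.push(child.length); });
});

var maxChildCount = learnUpperBound(childCounts, 3);
var maxNameLength = learnUpperBound(nameLengths, 3);

// flag rows that exceed either learned bound
var learnedAnomalies = sample.filter(function(row) {
    return row.Children.length > maxChildCount ||
        row.Children.some(function(child) { return child.length > maxNameLength; });
});
console.log(learnedAnomalies);
```

With this sample only the George row is flagged, because one of its child names is far longer than the rest. A median-based bound is used here instead of mean and standard deviation because a single extreme value shifts it much less, but with very little data any learned threshold should be treated with caution.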

adtanasa