Setup "dataspectsSystem on DigitalOcean"

From dataspects::Wiki
C190110121700


docker-compose.yml

/usr/local/etc/dataspects/docker-compose.yml specifies all Docker containers and volumes.

Docker volumes

The Docker volume dataspects_webservice_customer_data is mounted into the containers of the webservice and sidekiq services at /usr/dataspectsSoftware/.
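The mount described above can be checked with a short Ruby sketch. The compose excerpt embedded below is an assumption written for illustration: only the volume name, the two service names and the mount path come from this page, and the real docker-compose.yml may be laid out differently. The helper mounts_of is hypothetical as well.

```ruby
require 'yaml'

# ASSUMPTION: a hypothetical excerpt of docker-compose.yml. Only the volume
# name, the service names and the mount path are taken from this page.
COMPOSE_EXCERPT = <<~YAML
  services:
    webservice:
      volumes:
        - dataspects_webservice_customer_data:/usr/dataspectsSoftware/
    sidekiq:
      volumes:
        - dataspects_webservice_customer_data:/usr/dataspectsSoftware/
  volumes:
    dataspects_webservice_customer_data:
YAML

# Return { service name => mount path } for every service that mounts
# the named volume (short "source:target" syntax only).
def mounts_of(compose, volume_name)
  compose.fetch('services', {}).each_with_object({}) do |(service, spec), mounts|
    Array((spec || {})['volumes']).each do |entry|
      source, target = entry.split(':', 2)
      mounts[service] = target if source == volume_name
    end
  end
end

MOUNTS = mounts_of(YAML.safe_load(COMPOSE_EXCERPT),
                   'dataspects_webservice_customer_data')
MOUNTS.each { |service, path| puts "#{service} mounts the volume at #{path}" }
```

To run the same check against the real file, the embedded string could be replaced by File.read('/usr/local/etc/dataspects/docker-compose.yml'), provided that file uses the short volume syntax.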

Images and containers

The containers for both services, webservice and sidekiq:

  1. are run from the Docker image of C190110142823, hosted at 550143289598.dkr.ecr.eu-central-1.amazonaws.com/dataspects-webservice, which contains the Ruby gem https://github.com/dataspects/dataspects
  2. have these environment variables set:
    • DATASPECTS_SEARCH_CONFIG_FILE: /usr/dataspectsSoftware/dataspectsSearch_config.yml
    • DATASPECTS_PLUGINS_FOLDER: /usr/dataspectsSoftware/PLUGINS

Indexing ResourceSilos

DATASPECTS_PLUGINS_FOLDER: /usr/dataspectsSoftware/PLUGINS is the location where plugins are placed. Code in such plugins adds and/or overrides methods in https://github.com/dataspects/dataspects.
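How a plugin might add or overwrite methods can be sketched in plain Ruby by reopening a class after it has been defined. This is a minimal sketch under stated assumptions, not the gem's actual loading mechanism: the stand-in Dataspects::Indexer class, its index_name and plugin_loaded? methods, the load_plugins helper and the file name my_plugin.rb are all hypothetical, and a temporary directory stands in for /usr/dataspectsSoftware/PLUGINS.

```ruby
require 'tmpdir'

# ASSUMPTION: a stand-in for a class shipped by the gem; the real class
# and its methods may look different.
module Dataspects
  class Indexer
    def index_name
      'default-index'
    end
  end
end

# Hypothetical loader: load every *.rb file below the plugins folder.
# Sorting makes the load order, and thus the winning override, predictable.
def load_plugins(folder)
  Dir.glob(File.join(folder, '**', '*.rb')).sort.each { |file| load file }
end

# A temporary folder stands in for /usr/dataspectsSoftware/PLUGINS. Inside
# the containers the path would come from ENV['DATASPECTS_PLUGINS_FOLDER'].
Dir.mktmpdir do |dir|
  File.write(File.join(dir, 'my_plugin.rb'), <<~'RUBY')
    module Dataspects
      class Indexer
        # Overwrites the method defined by the stand-in class
        def index_name
          'customer-index'
        end

        # Adds a method that did not exist before
        def plugin_loaded?
          true
        end
      end
    end
  RUBY

  load_plugins(dir)
end

puts Dataspects::Indexer.new.index_name
```

Because Ruby classes are open, the plugin file needs no registration step in this sketch: loading it after the original definition is enough for its methods to take effect.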

Plugins can be:

Indexing file system directories

See Feature "Indexing file system directories".

Indexing ResourceSilos providing an API

ResourceSilos that provide an API are indexed by running a dataspects workload script that builds on the RubyGem "dataspects".

Here's an example of a workload script:

user@workstation:~/myIndexingWorkspace$ tree
.
├── profiles.yml
└── myIndexer.rb

# myIndexer.rb
module Dataspects
  class MyIndexer < Indexer

    # STEP 1: Specify basic indexing components
    def initialize
      @sProfilesURL = 'profiles.yml'
      # The following labels refer to items in profiles.yml
      @sTIKAServerLabel = 'dataspectsSystemTIKAServer'
      @sElasticsearchClusterName = "dataspectsSystemESCluster"
      @sResourceSiloName = "localmediawiki"
      @sIndexName = "smwckindex-0"
      super
      # Specify a ResourceSilo, e.g. a MediaWiki
      @oSMW = SemanticMediaWiki.new(@oProfiles, @sResourceSiloName, @hOptions)
    end

    # STEP 2: Specify which Resources to index from the
    #         ResourceSilo specified in STEP 1
    def ajsonObjectURIs
    end

    # STEP 3: Iterate through all Resources selected for indexing
    #         and apply entitizers (splitting a Resource into a single
    #         or multiple entities) and documentors (creating an
    #         ElasticsearchJSONDocument from an Entity)
    def store_RESOURCENAME(jsonResourceURI)
    end

  end
end
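The way STEP 2 and STEP 3 relate to each other can be illustrated with a self-contained sketch of the template-method pattern: a base class drives the loop, and the subclass supplies the selection and the storing. Everything below is hypothetical and deliberately uses names that differ from the gem's: SketchIndexer, MyWikiIndexer, resource_uris, store_resource and the page URIs are assumptions for illustration, and documents are collected in memory instead of being sent to Elasticsearch.

```ruby
# ASSUMPTION: a stand-in base class that only models the control flow
# suggested by STEP 2 and STEP 3; it is not the gem's Indexer.
class SketchIndexer
  attr_reader :stored

  def initialize
    @stored = []
  end

  # Template method: select the Resources, then store each of them.
  # Returns the number of documents collected.
  def run
    resource_uris.each { |uri| store_resource(uri) }
    stored.size
  end
end

class MyWikiIndexer < SketchIndexer
  INDEX_NAME = 'smwckindex-0'

  # STEP 2 analogue: decide which Resources to index (made-up URIs)
  def resource_uris
    %w[wiki/Page_A wiki/Page_B]
  end

  # STEP 3 analogue: split a Resource into entities, then turn each
  # entity into a document. Here each Resource yields exactly one entity.
  def store_resource(uri)
    entities = [uri]
    entities.each do |entity|
      @stored << { 'index' => INDEX_NAME, 'uri' => entity }
    end
  end
end

indexer = MyWikiIndexer.new
puts "Collected #{indexer.run} documents"
```

Keeping the loop in the base class means a workload script only has to answer two questions: which Resources to index, and how to turn each one into documents.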

Federated search engine

The webservice service serves, at https://webservice.dataspects.com, the endpoints specified in https://github.com/dataspects/dataspects-webservice/blob/master/config/routes.rb.