Setup "dataspectsSystem on DigitalOcean"



/usr/local/etc/dataspects/docker-compose.yml specifies all Docker containers and volumes.

Docker volumes

The Docker volume dataspects_webservice_customer_data is mounted into the webservice and sidekiq services at /usr/dataspectsSoftware/ inside the containers.

Images and containers

The containers for both services, webservice and sidekiq:

  1. are run from the C190110142823's Docker image hosted at which contains the Ruby gem
  2. have these environment variables set:
    • DATASPECTS_SEARCH_CONFIG_FILE: /usr/dataspectsSoftware/dataspectsSearch_config.yml
    • DATASPECTS_PLUGINS_FOLDER: /usr/dataspectsSoftware/PLUGINS
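
As a minimal sketch of how code running inside either container might pick up these two settings, the following Ruby reads both variables and falls back to the paths listed above when a variable is unset or empty. The helper name dataspects_settings, the constant DATASPECTS_DEFAULTS and the fallback behaviour are assumptions made for illustration; they are not taken from the dataspects gem.

```ruby
# Hypothetical helper (not part of the dataspects gem): resolve the two
# settings from the environment, falling back to the documented paths.
DATASPECTS_DEFAULTS = {
  'DATASPECTS_SEARCH_CONFIG_FILE' => '/usr/dataspectsSoftware/dataspectsSearch_config.yml',
  'DATASPECTS_PLUGINS_FOLDER'     => '/usr/dataspectsSoftware/PLUGINS'
}.freeze

# env defaults to the process environment; a plain Hash can be passed
# instead, which makes the helper easy to exercise outside a container.
def dataspects_settings(env = ENV)
  DATASPECTS_DEFAULTS.each_with_object({}) do |(name, default), settings|
    value = env[name]
    settings[name] = value.nil? || value.empty? ? default : value
  end
end

settings = dataspects_settings
puts settings['DATASPECTS_SEARCH_CONFIG_FILE']
puts settings['DATASPECTS_PLUGINS_FOLDER']
```

Passing the environment as a parameter keeps the lookup testable: dataspects_settings({}) returns the two default paths without touching the real environment.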

Indexing ResourceSilos

DATASPECTS_PLUGINS_FOLDER: /usr/dataspectsSoftware/PLUGINS is the location where plugins are placed. Code in such plugins adds and/or overwrites methods in
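
The general Ruby technique behind plugins that add or overwrite methods is to reopen a class from a file that is loaded at runtime. The sketch below illustrates only that technique: the class Documentor, the helper load_plugins and the plugin file my_plugin.rb are hypothetical names, and a temporary directory stands in for DATASPECTS_PLUGINS_FOLDER. How the dataspects software actually discovers and loads plugins may differ.

```ruby
require 'tmpdir'

# Hypothetical stand-in for a class shipped with the application.
class Documentor
  def title(resource)
    resource[:name]
  end
end

# Load every Ruby file found directly in the plugins folder, in a stable order.
def load_plugins(folder)
  Dir.glob(File.join(folder, '*.rb')).sort.each { |path| load path }
end

# Demonstration: a throwaway folder plays the role of the plugins folder.
Dir.mktmpdir do |folder|
  File.write(File.join(folder, 'my_plugin.rb'), <<~'RUBY')
    class Documentor
      # Overwrite an existing method
      def title(resource)
        resource[:name].upcase
      end

      # Add a new method
      def summary(resource)
        "#{title(resource)} (#{resource[:kind]})"
      end
    end
  RUBY
  load_plugins(folder)
end

doc = Documentor.new
puts doc.summary({ name: 'main page', kind: 'wiki' })  # prints "MAIN PAGE (wiki)"
```

Because Ruby classes stay open, the plugin file needs no registration step: loading it is enough to replace title and to add summary on the existing class.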

Plugins can be:

Indexing file system directories

See Feature "Indexing file system directories".

Indexing ResourceSilos providing an API

ResourceSilos that provide an API are indexed by running a dataspects workload script, which builds on the RubyGem "dataspects".

Here's an example of a workload script:

user@workstation:~/myIndexingWorkspace$ tree
├── profiles.yml
└── myIndexer.rb
# myIndexer.rb
module Dataspects
  class MyIndexer < Indexer

    # STEP 1: Specify basic indexing components
    def initialize
      @sProfilesURL = 'profiles.yml'
      # The following labels refer to items in profiles.yml
      @sTIKAServerLabel = 'dataspectsSystemTIKAServer'
      @sElasticsearchClusterName = "dataspectsSystemESCluster"
      @sResourceSiloName = "localmediawiki"
      @sIndexName = "smwckindex-0"
      # Specify a ResourceSilo, e.g. a MediaWiki
      @oSMW =, @sResourceSiloName, @hOptions)
    end

    # STEP 2: Specify which Resources to index from the
    #         ResourceSilo specified to index
    def ajsonObjectURIs
      # ...
    end

    # STEP 3: Iterate through all Resources selected for indexing
    #         and apply entitizers (splitting a Resource into a single
    #         or multiple entities) and documentors (creating an
    #         ElasticsearchJSONDocument from an Entity)
    def store_RESOURCENAME(jsonResourceURI)
      # ...
    end

  end
end
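
To make the three steps of the workload script more concrete, here is a self-contained sketch of the same pattern. It does not use the dataspects gem. The module WorkloadSketch, its Indexer base class, the methods resource_uris, store_resource and entitize, and the in-memory hash that replaces a real ResourceSilo are all hypothetical stand-ins, and the resulting JSON documents are collected in an array instead of being sent to Elasticsearch.

```ruby
require 'json'

module WorkloadSketch
  # Hypothetical stand-in for a base class that drives the steps but knows
  # nothing about a concrete ResourceSilo.
  class Indexer
    def run
      resource_uris.each { |uri| store_resource(uri) }
    end
  end

  class MyIndexer < Indexer
    attr_reader :documents

    # STEP 1: basic components (an in-memory hash replaces the real silo)
    def initialize(silo)
      @silo = silo
      @index_name = 'smwckindex-0'
      @documents = []
    end

    # STEP 2: select the Resources to index
    def resource_uris
      @silo.keys.sort
    end

    # STEP 3: entitize each Resource, then build one JSON document per entity
    def store_resource(uri)
      entitize(@silo.fetch(uri)).each do |entity|
        @documents << JSON.generate({ index: @index_name, source: uri, text: entity })
      end
    end

    private

    # Entitizer: split a Resource into entities, here one per paragraph
    def entitize(resource)
      resource.split(/\n{2,}/).map(&:strip).reject(&:empty?)
    end
  end
end

silo = { 'wiki/Main_Page' => "First paragraph.\n\nSecond paragraph." }
indexer = WorkloadSketch::MyIndexer.new(silo)
indexer.run
puts indexer.documents
```

In this sketch one Resource with two paragraphs yields two documents, which mirrors the idea that an entitizer may split a Resource into a single entity or several, each of which a documentor then turns into its own JSON document.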


Federated search engine

webservice serves the endpoints specified in at