Fragment Selector Providers

Data.Path identifies a file. Data.Selector can identify a fragment inside that file. Data.SelectorFormat tells ProcessCore which selector language to use when comparing two fragments.

CSV Fragment Selector

The built-in CSV provider understands RFC 7111 row, column, and cell selectors.

With the provider, we can parse textual selectors into their typed representation and write them back as text.

let provider = CsvFragmentSelectorProvider()

let columnSelector = provider.TryParse "col=1-3"
Some (ColumnSelector [{ First = Index 1
                        Last = Index 3 }])
let cellSelector = provider.TryParse "cell=2,2"
Some (CellSelector [{ Rows = { First = Index 2
                               Last = Index 2 }
                      Columns = { First = Index 2
                                  Last = Index 2 } }])
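
To make the parse-and-write round trip concrete, here is a minimal standalone sketch in plain F#. It covers only the `col=N` and `col=N-M` forms of RFC 7111 and is not ProcessCore's parser: the `SelectorTextSketch` module, the `ColumnRange` type, and both functions are hypothetical names introduced for illustration.

```fsharp
module SelectorTextSketch =

    // Hypothetical type for illustration; not ProcessCore's CsvFragmentSelector.
    type ColumnRange = { First: int; Last: int }

    /// Parses a 1-based index; returns None for non-numbers and values below 1.
    let private tryIndex (s: string) =
        match System.Int32.TryParse(s) with
        | true, n when n >= 1 -> Some n
        | _ -> None

    /// Builds a range from two index strings, rejecting reversed bounds.
    let private tryRange (first: string) (last: string) =
        match tryIndex first, tryIndex last with
        | Some f, Some l when f <= l -> Some { First = f; Last = l }
        | _ -> None

    /// Parses "col=N" or "col=N-M"; every other input yields None.
    let tryParseColumns (text: string) : ColumnRange option =
        if isNull text || not (text.StartsWith("col=", System.StringComparison.Ordinal)) then
            None
        else
            match text.Substring(4).Split('-') with
            | [| single |] -> tryRange single single
            | [| first; last |] -> tryRange first last
            | _ -> None

    /// Writes a range back as selector text.
    let toSelectorString (range: ColumnRange) =
        if range.First = range.Last then sprintf "col=%d" range.First
        else sprintf "col=%d-%d" range.First range.Last

    let roundTrip = tryParseColumns "col=1-3" |> Option.map toSelectorString // Some "col=1-3"
    let single = tryParseColumns "col=2" |> Option.map toSelectorString     // Some "col=2"
    let rejected = tryParseColumns "row=1"                                  // None
```

Returning an option instead of throwing mirrors the `TryParse` shape used by the provider, so malformed selector text can be handled without exceptions.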

The provider can also relate two selectors. The relation is one of Exact, Contains, or Disjoint. In the example below, col=1-3 contains cell=2,2.

// same selector is exact match
provider.Relate (columnSelector.Value) (columnSelector.Value)
Exact
// column contains cell
provider.Relate (columnSelector.Value) (cellSelector.Value)
Contains
// disjoint columns
provider.Relate (columnSelector.Value) ((provider.TryParse "col=4-6").Value)
Disjoint
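
The three outcomes above can be illustrated with a small standalone sketch that compares closed, 1-based index ranges. This is an assumption about how such a comparison could look, not ProcessCore's implementation: the `RangeRelationSketch` module and its `Range`, `Relation`, and `relate` names are hypothetical, and mapping partial overlap to `Unknown` is a choice made for this sketch only.

```fsharp
module RangeRelationSketch =

    // Hypothetical types for illustration; not ProcessCore's FragmentRelation.
    type Range = { First: int; Last: int }

    type Relation =
        | Exact
        | Contains
        | Disjoint
        | Unknown

    /// Relates two closed index ranges from the container's point of view.
    let relate (container: Range) (candidate: Range) =
        if container = candidate then
            Exact
        elif container.First <= candidate.First && candidate.Last <= container.Last then
            Contains
        elif container.Last < candidate.First || candidate.Last < container.First then
            Disjoint
        else
            // Partial overlap, or the candidate is larger than the container.
            Unknown

    let columns13 = { First = 1; Last = 3 }

    let same = relate columns13 columns13                    // Exact
    let inside = relate columns13 { First = 2; Last = 2 }    // Contains
    let apart = relate columns13 { First = 4; Last = 6 }     // Disjoint
    let overlap = relate columns13 { First = 2; Last = 5 }   // Unknown
```

The relation is directional: swapping the arguments of the `inside` case yields `Unknown` in this sketch, because the smaller range does not contain the larger one.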

Providers matter when both data nodes have selectors. Without a registered provider, ProcessCore can see that the paths match but cannot know whether col=1-3 contains cell=2,2.

let dataset = Dataset("fragment-demo")

let exportedColumns = Data(path = "measurements.csv", selector = "col=1-3", selectorFormat = CsvFragmentSelectorProvider.SelectorFormatUri, encodingFormat = "text/csv")
let measuredCell = Data(path = "measurements.csv", selector = "cell=2,2", selectorFormat = CsvFragmentSelectorProvider.SelectorFormatUri, encodingFormat = "text/csv")
let interpretedSample = Material("Interpreted sample", additionalType = "Sample")

let export = LabProcess("Export CSV")
export.AddOutputData(exportedColumns)

let interpret = LabProcess("Interpret selected cell")
interpret.AddInputData(measuredCell)
interpret.AddOutputMaterial(interpretedSample)

dataset.AddProcess(export)
dataset.AddProcess(interpret)

let beforeRegistration =
    exportedColumns.DownstreamMaterials(scope = dataset.AllProcesses())
    |> Seq.map (fun m -> m.Name)

beforeRegistration
seq []

Register the provider on the dataset. Registration is stored on the root dataset, so child datasets share the same selector-provider lookup.

dataset.RegisterFragmentSelectorProvider(provider)
// dataset.RegisterFragmentSelectorProvider(provider :> IFragmentSelectorProvider) // also works when upcast to interface

let registeredProvider =
    dataset.TryGetFragmentSelectorProvider(CsvFragmentSelectorProvider.SelectorFormatUri)
    |> Option.map (fun p -> p.SelectorFormat)

registeredProvider
Some "https://datatracker.ietf.org/doc/html/rfc7111"

With the provider registered, the same traversal now finds the downstream material.

let afterRegistration =
    exportedColumns.DownstreamMaterials(scope = dataset.AllProcesses())
    |> Seq.map (fun m -> m.Name)

afterRegistration
seq ["Interpreted sample"]
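
The difference before and after registration comes down to a lookup keyed by selector format. The following is a minimal sketch of that idea in plain F#, not ProcessCore's implementation: the `RegistrySketch` module, the `Relation` and `Relate` types, `tryRelate`, and the stub `csvRelate` are all hypothetical names introduced for illustration.

```fsharp
module RegistrySketch =

    // Hypothetical model; none of these names are ProcessCore API.
    type Relation =
        | Exact
        | Contains
        | Disjoint
        | Unknown

    /// Relates two selector strings; None means the stub does not know the pair.
    type Relate = string -> string -> Relation option

    /// Looks up the relate function for a selector format and applies it.
    /// Returns None when nothing is registered for that format.
    let tryRelate (registry: Map<string, Relate>) (format: string) (container: string) (candidate: string) =
        registry
        |> Map.tryFind format
        |> Option.bind (fun relate -> relate container candidate)

    let csvFormat = "https://datatracker.ietf.org/doc/html/rfc7111"

    // Stub that knows a single pair of selectors; a real provider would parse both strings.
    let csvRelate: Relate =
        fun container candidate ->
            if container = "col=1-3" && candidate = "cell=2,2" then Some Contains
            else None

    let emptyRegistry: Map<string, Relate> = Map.empty
    let withCsv = emptyRegistry |> Map.add csvFormat csvRelate

    let unregistered = tryRelate emptyRegistry csvFormat "col=1-3" "cell=2,2" // None
    let registered = tryRelate withCsv csvFormat "col=1-3" "cell=2,2"         // Some Contains
```

With an empty registry the question "does col=1-3 contain cell=2,2?" has no answer, which matches the empty result seen before registration.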

Custom Fragment Selector Providers

ProcessCore supports generic fragment selectors so that any kind of fragment can be described, provided a selector specification exists for it. In the data model, each selector language corresponds to an implementation of the IFragmentSelectorProvider interface. Once registered on a dataset, that implementation is used to relate any two selectors that share its SelectorFormat.

In most cases, inherit from FragmentSelectorProviderBase<'Selector>, which implements IFragmentSelectorProvider for you. You supply SelectorFormat, TryParse, ToSelectorString, and a typed Relate. The provider then parses selector strings into the typed selector and returns a semantic relation.

type PrefixSelectorProvider() =
    inherit FragmentSelectorProviderBase<string>()

    override _.SelectorFormat = "urn:example:prefix-selector"

    override _.TryParse(text: string) =
        if System.String.IsNullOrWhiteSpace text then None
        else Some (text.Trim('/'))

    override _.ToSelectorString(selector: string) =
        selector

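    override _.Relate(container: string) (candidate: string) =
        if container = candidate then Exact
        // The trailing "/" keeps "assay/tab" from matching "assay/table".
        elif candidate.StartsWith(container + "/", System.StringComparison.Ordinal) then Contains
        // A prefix check alone cannot prove that two selectors are disjoint.
        else Unknown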

let customProviderResult =
    let p = PrefixSelectorProvider()
    (p :> IFragmentSelectorProvider).TryRelate "assay/table" "assay/table/row/1"

customProviderResult
Some Contains
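
The "/" appended to the container is what keeps the prefix test aligned to path segments. The standalone sketch below repeats the comparison outside the provider class so the edge cases are easy to see. The `PrefixSketch` module and its local `Relation` type are hypothetical stand-ins for FragmentRelation, and the ordinal string comparison is a choice made for this sketch.

```fsharp
module PrefixSketch =

    // Local stand-in for illustration; not ProcessCore's FragmentRelation.
    type Relation =
        | Exact
        | Contains
        | Unknown

    /// Relates two slash-separated selectors by segment-aligned prefix.
    let relate (container: string) (candidate: string) =
        if container = candidate then Exact
        elif candidate.StartsWith(container + "/", System.StringComparison.Ordinal) then Contains
        else Unknown

    let nested = relate "assay/table" "assay/table/row/1"   // Contains
    let same = relate "assay/table" "assay/table"           // Exact
    let partial = relate "assay/tab" "assay/table"          // Unknown: "tab" is not a whole segment
    let reversed = relate "assay/table/row/1" "assay/table" // Unknown: the candidate is the wider one
```

Returning `Unknown` rather than a disjoint result is the conservative choice here, since a failed prefix test says nothing about whether the two fragments overlap.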

What To Use When

| Task | API |
| --- | --- |
| Mark a file fragment | Data.Selector, Data.SelectorFormat |
| Use CSV row/column/cell fragments | CsvFragmentSelectorProvider |
| Enable selector-aware traversal | dataset.RegisterFragmentSelectorProvider(provider) |
| Inspect registered providers | TryGetFragmentSelectorProvider, GetFragmentSelectorProviders |
| Add a selector language | FragmentSelectorProviderBase<'Selector> |
