Fragment Selector Providers
Data.Path identifies a file. Data.Selector can identify a fragment inside that file. Data.SelectorFormat tells ProcessCore which selector language to use when comparing two fragments.
CSV Fragment Selector
The built-in CSV provider understands RFC 7111 row, column, and cell selectors.
With the provider, we can parse textual selectors into their typed representation and write typed selectors back out as text.
let provider = CsvFragmentSelectorProvider()
let columnSelector = provider.TryParse "col=1-3"
let cellSelector = provider.TryParse "cell=2,2"
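To make the parsing step concrete, here is a minimal, self-contained sketch of a parser for a small subset of RFC 7111 (a single `row=`, `col=`, or `cell=` selector with 1-based positions). The type `SimpleCsvSelector` and the functions `tryPositiveInt`, `tryParseRange`, and `tryParseSelector` are hypothetical names made up for this illustration. They are not the library's `CsvFragmentSelector` or its parser, and the sketch ignores parts of the RFC such as `*` and multi-selections.

```fsharp
// Hypothetical, simplified selector model; not the library's CsvFragmentSelector.
type SimpleCsvSelector =
    | Rows of first: int * last: int
    | Columns of first: int * last: int
    | Cell of row: int * column: int

// Parse a 1-based position; non-numeric text or values below 1 yield None.
let tryPositiveInt (text: string) =
    match System.Int32.TryParse(text) with
    | true, n when n >= 1 -> Some n
    | _ -> None

// Parse "5" or "5-7" into an inclusive (first, last) range.
let tryParseRange (text: string) =
    match text.Split('-') |> Array.map tryPositiveInt with
    | [| Some n |] -> Some (n, n)
    | [| Some first; Some last |] when first <= last -> Some (first, last)
    | _ -> None

// Parse "row=...", "col=...", or "cell=row,column".
let tryParseSelector (text: string) =
    match text.Split('=') with
    | [| "row"; range |] ->
        tryParseRange range |> Option.map (fun (first, last) -> Rows (first, last))
    | [| "col"; range |] ->
        tryParseRange range |> Option.map (fun (first, last) -> Columns (first, last))
    | [| "cell"; position |] ->
        match position.Split(',') |> Array.map tryPositiveInt with
        | [| Some row; Some column |] -> Some (Cell (row, column))
        | _ -> None
    | _ -> None

let parsedColumns = tryParseSelector "col=1-3"   // Some (Columns (1, 3))
let parsedCell = tryParseSelector "cell=2,2"     // Some (Cell (2, 2))
let rejected = tryParseSelector "col=3-1"        // None: range is reversed
```

Returning an option mirrors the `TryParse` style used above: malformed input produces `None` instead of an exception.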
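The provider can also relate two selectors. For the CSV selectors shown here, the relation is one of Exact, Contains, or Disjunct.
In this case, col=1-3 contains cell=2,2, because the cell lies in column 2, which falls inside the column range 1-3.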
// same selector is exact match
provider.Relate (columnSelector.Value) (columnSelector.Value)
// column contains cell
provider.Relate (columnSelector.Value) (cellSelector.Value)
// disjoint columns
provider.Relate (columnSelector.Value) ((provider.TryParse "col=4-6").Value)
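The three comparisons above can be reasoned about with plain range arithmetic. The following self-contained sketch shows one way such a relation could be computed for column ranges and cells. `SketchRelation`, `SketchSelector`, and `relate` are hypothetical names for this illustration and do not reproduce the library's implementation. In particular, how the real provider classifies partially overlapping ranges is not stated here, so the sketch returns `Unknown` for them as a placeholder.

```fsharp
// Hypothetical stand-ins for illustration only.
type SketchRelation = Exact | Contains | Disjunct | Unknown

type SketchSelector =
    | Columns of first: int * last: int
    | Cell of row: int * column: int

// Relate a container selector to a candidate selector.
let relate (container: SketchSelector) (candidate: SketchSelector) =
    match container, candidate with
    | a, b when a = b -> Exact
    | Columns (first, last), Columns (otherFirst, otherLast) ->
        if first <= otherFirst && otherLast <= last then Contains
        elif otherLast < first || last < otherFirst then Disjunct
        else Unknown // partial overlap: placeholder choice of this sketch
    | Columns (first, last), Cell (_, column) ->
        if first <= column && column <= last then Contains else Disjunct
    | Cell (_, column), Columns (first, last) ->
        // A cell cannot contain a whole column; the sketch only rules out overlap.
        if first <= column && column <= last then Unknown else Disjunct
    | Cell _, Cell _ -> Disjunct // equal cells were handled by the first case

let sameSelector = relate (Columns (1, 3)) (Columns (1, 3))  // Exact
let columnVsCell = relate (Columns (1, 3)) (Cell (2, 2))     // Contains
let farApart = relate (Columns (1, 3)) (Columns (4, 6))      // Disjunct
```

Note that the relation is directional: the first argument is treated as the container and the second as the candidate, as in the `Relate` calls above.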
Providers matter when both data nodes have selectors. Without a registered provider, ProcessCore can see that the paths match but cannot know whether col=1-3 contains cell=2,2.
let dataset = Dataset("fragment-demo")
let exportedColumns = Data(path = "measurements.csv", selector = "col=1-3", selectorFormat = CsvFragmentSelectorProvider.SelectorFormatUri, encodingFormat = "text/csv")
let measuredCell = Data(path = "measurements.csv", selector = "cell=2,2", selectorFormat = CsvFragmentSelectorProvider.SelectorFormatUri, encodingFormat = "text/csv")
let interpretedSample = Material("Interpreted sample", additionalType = "Sample")
let export = LabProcess("Export CSV")
export.AddOutputData(exportedColumns)
let interpret = LabProcess("Interpret selected cell")
interpret.AddInputData(measuredCell)
interpret.AddOutputMaterial(interpretedSample)
dataset.AddProcess(export)
dataset.AddProcess(interpret)
let beforeRegistration =
    exportedColumns.DownstreamMaterials(scope = dataset.AllProcesses())
    |> Seq.map (fun m -> m.Name)
beforeRegistration
Register the provider on the dataset. Registration is stored on the root dataset, so child datasets share the same selector-provider lookup.
dataset.RegisterFragmentSelectorProvider(provider)
// dataset.RegisterFragmentSelectorProvider(provider :> IFragmentSelectorProvider) // also works when upcast to interface
let registeredProvider =
    dataset.TryGetFragmentSelectorProvider(CsvFragmentSelectorProvider.SelectorFormatUri)
    |> Option.map (fun p -> p.SelectorFormat)
registeredProvider
let afterRegistration =
    exportedColumns.DownstreamMaterials(scope = dataset.AllProcesses())
    |> Seq.map (fun m -> m.Name)
afterRegistration
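The idea that registration lives on the root dataset, so every dataset in the tree resolves the same provider, can be sketched in a few lines. Everything below is a hypothetical model (`SketchProvider`, `SketchDataset`, `root`, `register`, `tryGetProvider`, `relate`) and says nothing about how ProcessCore is actually implemented. It also assumes that a missing provider is reported as `Unknown`, which is a choice made for this sketch.

```fsharp
open System.Collections.Generic

// Hypothetical stand-ins; none of these types are part of ProcessCore.
type SketchRelation = Exact | Contains | Disjunct | Unknown

type SketchProvider =
    { SelectorFormat: string
      TryRelate: string -> string -> SketchRelation }

type SketchDataset =
    { Parent: SketchDataset option
      Providers: Dictionary<string, SketchProvider> }

// Walk up to the root so that all datasets in a tree share one registry.
let rec root (dataset: SketchDataset) =
    match dataset.Parent with
    | Some parent -> root parent
    | None -> dataset

// Registering on any dataset stores the provider on the root.
let register (provider: SketchProvider) (dataset: SketchDataset) =
    (root dataset).Providers.[provider.SelectorFormat] <- provider

let tryGetProvider (selectorFormat: string) (dataset: SketchDataset) =
    match (root dataset).Providers.TryGetValue(selectorFormat) with
    | true, provider -> Some provider
    | _ -> None

// Without a provider for the format, the relation cannot be decided.
let relate selectorFormat container candidate dataset =
    match tryGetProvider selectorFormat dataset with
    | Some provider -> provider.TryRelate container candidate
    | None -> Unknown

let rootDataset = { Parent = None; Providers = Dictionary() }
let childDataset = { Parent = Some rootDataset; Providers = Dictionary() }

let prefixProvider =
    { SelectorFormat = "urn:example:prefix-selector"
      TryRelate =
        fun (container: string) (candidate: string) ->
            if container = candidate then Exact
            elif candidate.StartsWith(container + "/") then Contains
            else Unknown }

let beforeSketch = relate "urn:example:prefix-selector" "a" "a/b" childDataset // Unknown
register prefixProvider childDataset
let afterSketch = relate "urn:example:prefix-selector" "a" "a/b" rootDataset   // Contains
```

Registering through the child and then querying through the root gives the same answer, which is the sharing behaviour described above.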
Custom Fragment Selector Providers
ProcessCore supports a generic fragment selector syntax so that any kind of fragment can be described, provided a fragment selector specification exists for it.
In the data model, this corresponds to an implementation of the IFragmentSelectorProvider interface, which can be registered on a dataset and is then used to relate any two selectors that share the same SelectorFormat.
Usually, you should inherit from FragmentSelectorProviderBase<'Selector>, which implements IFragmentSelectorProvider and requires you to supply the selector format, a parser, a serializer, and a typed comparison.
The provider parses strings into a typed selector and returns a semantic relation.
type PrefixSelectorProvider() =
    inherit FragmentSelectorProviderBase<string>()
    override _.SelectorFormat = "urn:example:prefix-selector"
    override _.TryParse(text: string) =
        if System.String.IsNullOrWhiteSpace text then None
        else Some (text.Trim('/'))
    override _.ToSelectorString(selector: string) =
        selector
    override _.Relate(container: string) (candidate: string) =
        if container = candidate then Exact
        elif candidate.StartsWith(container + "/") then Contains
        else Unknown

let customProviderResult =
    let p = PrefixSelectorProvider()
    (p :> IFragmentSelectorProvider).TryRelate "assay/table" "assay/table/row/1"
customProviderResult
What To Use When
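| Task | API |
|---|---|
| Mark a file fragment | Data.Selector and Data.SelectorFormat |
| Use CSV row/column/cell fragments | CsvFragmentSelectorProvider |
| Enable selector-aware traversal | Dataset.RegisterFragmentSelectorProvider |
| Inspect registered providers | Dataset.TryGetFragmentSelectorProvider |
| Add a selector language | FragmentSelectorProviderBase<'Selector> (implements IFragmentSelectorProvider) |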