handwriting-sample
Module for manipulating online handwriting data. The package implements the HandwritingSample class, which enables fast and easy handling of handwriting data objects.
Repository
Basic Info
- Host: GitHub
- Owner: BDALab
- License: mit
- Language: Python
- Default Branch: main
- Homepage: https://handwriting-sample.readthedocs.io/en/latest/
- Size: 539 KB
Statistics
- Stars: 6
- Watchers: 1
- Forks: 0
- Open Issues: 8
- Releases: 0
Topics
Metadata Files
README.md
Handwriting Sample
This package provides a PyPI-installable module for manipulating
so-called online handwriting data (handwriting with dynamic information in the form of time-series) acquired
by Wacom digitizing tablets. The package implements the HandwritingSample class, which enables fast and easy handling
of handwriting data objects. Handwriting data must consist of the following 7 time-series: x, y, timestamp, pen status,
azimuth, tilt, pressure.
Main features:
- data load with validation
  - *.svc
  - *.json
  - HTML5 Pointer Event (with automatic data transformation)
  - array
  - pandas DataFrame
- unit transformation
  - axis to mm
  - time to seconds
  - angles to degrees
- simple access to and manipulation of time-series
- data storage
The package can also be used for data acquired from other devices, provided they supply the time-series listed above.
The full Sphinx-generated API documentation is available in the official documentation.
Contents:
1. Installation
2. Data
3. Examples
4. License
5. Contributors
Installation
pip install handwriting-sample
Data
Input data
Input data must consist of handwriting data in the form of time-series acquired by a Wacom digitizing tablet. Other similar devices can be used too, provided they satisfy the following data structure:
- x: X axis
- y: Y axis
- time: timestamp since epoch
- pen_status: pen up or down (0 = up, 1 = down)
- azimuth: azimuth of the pen tip
- tilt: tilt of the pen with respect to the tablet surface
- pressure: pressure
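As an illustration of the structure above, the following is a minimal sketch of a structural check on a plain dictionary of time-series. The helper `check_sample` and its messages are hypothetical and are not part of the package, whose own validation may work differently.

```python
# Names of the seven required time-series, as listed above.
REQUIRED_SERIES = ("x", "y", "time", "pen_status", "azimuth", "tilt", "pressure")


def check_sample(data):
    """Return a list of problems found in a dict of time-series (empty if none)."""
    problems = []

    # Every required time-series must be present.
    missing = [name for name in REQUIRED_SERIES if name not in data]
    if missing:
        problems.append("missing series: " + ", ".join(missing))

    # All present time-series must have the same number of samples.
    lengths = {len(data[name]) for name in REQUIRED_SERIES if name in data}
    if len(lengths) > 1:
        problems.append("series have different lengths")

    # pen_status is documented as 0 = up, 1 = down.
    if any(status not in (0, 1) for status in data.get("pen_status", [])):
        problems.append("pen_status must contain only 0 (up) or 1 (down)")

    return problems
```

An empty list means the dictionary has the expected shape; it says nothing about whether the values are in sensible units.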
Example of the *.svc database can be found here.
Metadata
To give more insight into the processed data sample, the package supports metadata. Metadata can be read in three ways:
1. (NOT RECOMMENDED) from the file name of the SVC file (see SVC file)
2. from the meta_data section of the JSON file (see JSON file)
3. from a key: value dictionary using add_meta_data, once the sample has been loaded (see Examples)
Input data examples
SVC file
A full SVC example can be found here.
```csv
606
4034 7509 354642400 1 1190 720 10852
4034 7509 354642408 1 1180 700 10997
4150 7582 354642416 1 1170 690 11061
4241 7639 354642423 1 1150 670 11077
4362 7714 354642431 1 1130 650 12085
4513 7810 354642438 1 1120 640 13222
4693 7926 354642446 1 1110 640 14278
...
```
The first line of an SVC file gives the number of samples (lines) in the file.
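To make the layout concrete, here is a minimal sketch of a reader for SVC text. It assumes whitespace-separated integer columns in the order x, y, time, pen_status, azimuth, tilt, pressure. The function `parse_svc` is hypothetical; to load real files, use the package's own loader shown in Examples.

```python
# Assumed column order of an SVC line.
COLUMNS = ("x", "y", "time", "pen_status", "azimuth", "tilt", "pressure")


def parse_svc(text):
    """Parse SVC text into a dict mapping each column name to a list of ints."""
    lines = [line.split() for line in text.splitlines() if line.strip()]
    declared_count = int(lines[0][0])  # first line: number of samples
    rows = lines[1:]
    if len(rows) != declared_count:
        raise ValueError(f"header declares {declared_count} samples, found {len(rows)}")

    series = {name: [] for name in COLUMNS}
    for row in rows:
        if len(row) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} values per line, got {len(row)}")
        for name, value in zip(COLUMNS, row):
            series[name].append(int(value))
    return series


example = """2
4034 7509 354642400 1 1190 720 10852
4034 7509 354642408 1 1180 700 10997
"""
sample = parse_svc(example)
print(sample["time"])
```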
SVC Metadata
Metadata are read from the file name with the following convention:
SubjectID_DateOfBirth_Gender_TaskNumber_AdministratorName_DateOfAcquisition.svc
example:
ID0025_18-07-2014_M_0007_Doe_12-05-2021.svc
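The naming convention can be illustrated with a small sketch that splits a file name on underscores. The function `parse_svc_file_name` and the dictionary keys are hypothetical, and the file name used below is only an illustrative instance of the convention.

```python
from pathlib import Path

# Field names follow the convention above; the snake_case keys are an assumption.
FIELDS = (
    "subject_id",
    "date_of_birth",
    "gender",
    "task_number",
    "administrator_name",
    "date_of_acquisition",
)


def parse_svc_file_name(path):
    """Split an SVC file name into its six underscore-separated fields."""
    parts = Path(path).stem.split("_")
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    return dict(zip(FIELDS, parts))


info = parse_svc_file_name("ID0025_18-07-2014_M_0007_Doe_12-05-2021.svc")
print(info["subject_id"], info["task_number"])
```

One limitation of such a scheme is that any field containing an underscore (for example a subject ID such as BD_1234) cannot be split reliably, so supplying metadata through JSON or a dictionary is more robust.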
JSON file
A full JSON example can be found here.
```json
{
    "meta_data":
    {
        "samples_count": 100,
        "column_names": ["x", "y", "time", "pen_status", "azimuth", "tilt", "pressure"],
        "administrator": "Doe",
        "participant":
        {
            "id": "BD_1234",
            "sex": "female",
            "birth_date": "2002-11-05"
        },
        "task_id": 7,
        ...
    },
    "data":
    {
        "x": [52.81, 52.83, 52.855, 52.87, 52.88, 52.89, 52.9, ...],
        "y": [52.81, 52.83, 52.855, 52.87, 52.88, 52.89, 52.9, ...],
        "time": [0.0, 0.007, 0.015, 0.022, 0.03, 0.037, 0.045, ...],
        "pen_status": [1, 1, 1, 1, 1, 1, 1, ...],
        "azimuth": [510.0, 510.0, 510.0, 510.0, 510.0, ...],
        "tilt": [520.0, 520.0, 520.0, 520.0, 520.0, ...],
        "pressure": [0.0, 0.01173, 0.022483, 0.035191, 0.056696, ...]
    }
}
```
JSON Metadata
Metadata are read from the "meta_data" section of the JSON file
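For orientation, the two sections can be separated with the standard `json` module alone. This is a minimal sketch: the function `split_json_sample` is hypothetical, and the fallback to the keys of `data` when `column_names` is absent is an assumption rather than documented package behaviour.

```python
import json


def split_json_sample(text):
    """Return (meta_data, data) from the JSON text of a handwriting sample."""
    document = json.loads(text)
    meta_data = document.get("meta_data", {})
    data = document["data"]
    # Assumption: fall back to the keys of "data" when no column_names are given.
    column_names = meta_data.get("column_names", list(data))
    return meta_data, {name: data[name] for name in column_names}


text = """
{
    "meta_data": {"samples_count": 2, "column_names": ["x", "y"], "task_id": 7},
    "data": {"x": [52.81, 52.83], "y": [52.81, 52.83]}
}
"""
meta_data, data = split_json_sample(text)
print(meta_data["task_id"], list(data))
```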
HTML5 Pointer Event
When using HTML5 Pointer Event data, ensure the proper identification of the time series order.
Time-series order is the same as it comes from the Google Chrome browser.
NOTE: When loading data from HTML5 Pointer Event, data are automatically transformed to the proper units! Please see the section Handwriting Unit Transformation in case of HTML5 Pointer Event
A full HTML5 Pointer Event example can be found here.
```json
{
    "x": [417.3515625, 417.3515625, 417.3515625, 416.96484375, 415.91796875, 414.98046875, ...],
    "y": [685.80078125, 685.80078125, 685.80078125, 685.47265625, 685.25390625, 685.25390625, ...],
    "time": [3982.0999999996275, 3982.0999999996275, 3982.0999999996275, 3995.9000000003725, 4021, ...],
    "pressure": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...],
    "button": [-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, ...],
    "buttons": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...],
    "twist": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...],
    "tiltX": [44, 44, 44, 44, 46, 46, 49, 49, 49, 49, 49, 48, 48, 48, 48, 48, 49, 49, 49, 49, 49, ...],
    "tiltY": [20, 20, 20, 18, 18, 17, 17, 17, 17, 17, 17, 18, 18, 18, 20, 20, 20, 20, 20, 20, 20, ...],
    "pointerType": "pen"
}
```
Numpy Array
When loading data from a numpy array, make sure the time-series order is identified correctly.
```python
import numpy

# one row per time-series; the rows follow the order given in column_names
array = numpy.array([
    [1, 1, 1, 1, 0],
    [1, 2, 3, 4, 5],
    [1, 2, 3, 4, 5],
    [254651615, 254651616, 254651617, 254651618, 254651619],
    [1, 2, 3, 4, 5],
    [1, 2, 3, 4, 5],
    [10, 20, 30, 40, 50]])

column_names = ['pen_status', 'y', 'x', 'time', 'azimuth', 'tilt', 'pressure']
```
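Why the order matters can be seen by pairing each row with its name. The sketch below uses plain lists and a hypothetical helper `label_rows`; it only illustrates the idea and is not how the package assigns names internally.

```python
def label_rows(rows, column_names):
    """Pair each row of a 2-D sequence with the name of its time-series."""
    if len(rows) != len(column_names):
        raise ValueError("expected exactly one name per row")
    return dict(zip(column_names, rows))


rows = [
    [1, 1, 1, 1, 0],
    [1, 2, 3, 4, 5],
    [1, 2, 3, 4, 5],
    [254651615, 254651616, 254651617, 254651618, 254651619],
    [1, 2, 3, 4, 5],
    [1, 2, 3, 4, 5],
    [10, 20, 30, 40, 50],
]
column_names = ["pen_status", "y", "x", "time", "azimuth", "tilt", "pressure"]

labelled = label_rows(rows, column_names)
print(labelled["pen_status"])
```

Swapping two entries of `column_names` would silently relabel the corresponding rows, which is why the order must match the array.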
Pandas DataFrame
```python
import numpy
import pandas

x = [1, 2, 3, 4, 5]
y = [1, 2, 3, 4, 5]
time = [254651615, 254651616, 254651617, 254651618, 254651619]
pen_status = [1, 1, 1, 1, 0]  # 0 = up, 1 = down
azimuth = [1, 2, 3, 4, 5]
tilt = [1, 2, 3, 4, 5]
pressure = [10, 20, 30, 40, 50]

# one column per time-series, in the order given in column_names
data_frame = pandas.DataFrame(numpy.column_stack([x, y, time, pen_status, azimuth, tilt, pressure]))

column_names = ['x', 'y', 'time', 'pen_status', 'azimuth', 'tilt', 'pressure']
```
Handwriting Unit Transformation
The package supports the following unit transformations:
1. axis values to mm: the axis transformation requires the Lines-Per-Inch (LPI) or Lines-Per-Millimeter (LPMM) value of the device. This value depends on the device type and on how the RAW data are gathered. By default, LPI is used for the conversion.
2. time to seconds: from the time since epoch to seconds starting from 0
3. angles to degrees: the angle transformation requires the maximal theoretical value of the raw angle range and the maximal value of the angle in degrees, based on device capabilities
4. pressure normalization: from the RAW pressure values to pressure levels, based on device capabilities
By default, the package uses predefined technical values for the Wacom Cintiq 16 tablet:
| Name | Value |
|---|---|
| LPI | 5080 |
| LPMM | 200 |
| MAX_PRESSURE_VALUE | 32767 |
| PRESSURE_LEVELS | 8192 |
| MAX_TILT_VALUE | 900 |
| MAX_TILT_DEGREE | 90 |
| MAX_AZIMUTH_VALUE | 3600 |
| MAX_AZIMUTH_DEGREE | 360 |
NOTE
When transforming units, make sure you use the proper technical values for your device.
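The arithmetic behind the first three transformations can be sketched with the default values from the table. This is a plain-Python illustration under stated assumptions: the function names are hypothetical, the raw timestamps are assumed to be in milliseconds, and the package's own implementation may differ in detail. Pressure normalization is left out because its exact formula is not given here.

```python
MM_PER_INCH = 25.4

# Defaults for the Wacom Cintiq 16, taken from the table above.
LPI = 5080
MAX_TILT_VALUE, MAX_TILT_DEGREE = 900, 90
MAX_AZIMUTH_VALUE, MAX_AZIMUTH_DEGREE = 3600, 360


def axis_to_mm(values, lpi=LPI):
    """Convert raw axis values (tablet lines) to millimetres using lines per inch."""
    return [value / lpi * MM_PER_INCH for value in values]


def time_to_seconds(timestamps, ticks_per_second=1000):
    """Shift timestamps to start at 0 and convert them to seconds (assumes ms ticks)."""
    start = timestamps[0]
    return [(stamp - start) / ticks_per_second for stamp in timestamps]


def angle_to_degree(values, max_raw, max_degree):
    """Scale raw angle values linearly from [0, max_raw] to [0, max_degree]."""
    return [value * max_degree / max_raw for value in values]


print(axis_to_mm([4034, 5080]))
print(time_to_seconds([354642400, 354642408, 354642416]))
print(angle_to_degree([720], MAX_TILT_VALUE, MAX_TILT_DEGREE))
print(angle_to_degree([1190], MAX_AZIMUTH_VALUE, MAX_AZIMUTH_DEGREE))
```

The two axis defaults are consistent with each other: 5080 lines per inch divided by 25.4 mm per inch gives 200 lines per millimetre, so dividing a raw value by the LPMM value yields the same result as the LPI-based conversion.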
Handwriting Unit Transformation in case of HTML5 Pointer Event
When loading data from HTML5 Pointer Event, data are automatically transformed to the proper units!
For this particular case the data transformation is inside the HTMLPointerEventReader class instead of the HandwritingSampleTransformer class.
The following default values are used:
| Name | Value |
|------------|----------------|
| DEFAULT_PIXEL_RESOLUTION | (1920, 1080) |
| DEFAULT_MM_DIMENSIONS | (344.2, 193.6) |
| PX_TO_MM | 0.1794 |
| DEFAULT_TIME_CONVERSION | 1000 |
| DEFAULT_DEVICE_PIXEL_RATIO | 1.0 |
No additional unit transformation is expected in this case, and the default values for the Wacom Cintiq 16 are used. The transformation includes:

axis values to mm
- to transform pixel values to millimeters, we calculate a simple ratio: `px_to_mm = tablet_width_in_mm / tablet_width_resolution_in_px`
- in case of the Cintiq 16 it is `px_to_mm = 344.2 / 1920 ≈ 0.1793` (the default value listed above is 0.1794)

time to seconds
- the default time unit of HTML5 Pointer Event is milliseconds: `time_in_seconds = time_in_milliseconds / 1000`
- moreover, the time is shifted so that the first value is 0: `times = [(time - html_data.get(TIME)[0]) / 1000 for time in html_data.get(TIME)]`

device pixel ratio
- this parameter compensates for the screen pixel ratio setting, known as Scale in Windows, which is usually set to 150% (1.5) on newer laptops
- the default value is 1.0
- in case of the Cintiq 16 it is 1.0, but it can differ for other devices
- `device_pixel_ratio = tablet_pixel_resolution[0] / DEFAULT_PIXEL_RESOLUTION[0]`
- `device_pixel_ratio = tablet_pixel_resolution[1] / DEFAULT_PIXEL_RESOLUTION[1]`

tiltX and tiltY to azimuth and tilt
- the default unit of HTML5 Pointer Event is degrees of tiltX and tiltY
- HandwritingSample uses azimuth and tilt in degrees
- to calculate tilt and azimuth, we transform degrees to radians, extract the angles, and transform them back to degrees
- moreover, negative angle values have to be processed and converted to absolute values
- for more details see the function transform_tilt_xy_to_azimuth_and_tilt in the HandwritingSampleTransformer class
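The steps above can be sketched in plain Python. The function names are hypothetical, and the angle conversion follows the general tilt-to-spherical relation described in the W3C Pointer Events specification; it is an assumption that the package's transform_tilt_xy_to_azimuth_and_tilt behaves comparably, so consult the package source for the exact handling of negative angles.

```python
import math

# Ratio for the Wacom Cintiq 16 computed from the defaults above (width in mm / width in px).
PX_TO_MM = 344.2 / 1920


def pixels_to_mm(values_px, px_to_mm=PX_TO_MM):
    """Convert pixel coordinates to millimetres."""
    return [value * px_to_mm for value in values_px]


def milliseconds_to_seconds(times_ms, time_conversion=1000):
    """Shift times to start at 0 and convert milliseconds to seconds."""
    start = times_ms[0]
    return [(time - start) / time_conversion for time in times_ms]


def tilt_xy_to_azimuth_and_tilt(tilt_x_deg, tilt_y_deg):
    """Convert tiltX/tiltY (degrees) to azimuth and tilt relative to the surface (degrees)."""
    tan_x = math.tan(math.radians(tilt_x_deg))
    tan_y = math.tan(math.radians(tilt_y_deg))
    # Azimuth: direction of the pen projected onto the surface, mapped to [0, 360).
    azimuth = math.degrees(math.atan2(tan_y, tan_x)) % 360
    # Tilt: angle between the pen and the surface (90 means perpendicular).
    tilt = math.degrees(math.atan2(1.0, math.hypot(tan_x, tan_y)))
    return azimuth, tilt


print(pixels_to_mm([417.3515625, 416.96484375]))
print(milliseconds_to_seconds([3982.0999999996275, 3995.9000000003725, 4021]))
print(tilt_xy_to_azimuth_and_tilt(44, 20))
```

Using `atan2` for both angles avoids special cases: a pen held perpendicular to the surface (both tilts 0) yields a tilt of 90 degrees without a division by zero.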
NOTE: If you wish to override the default values, you can pass them to the constructor of the HTMLPointerEventReader class using the following kwargs:
- transform_x_y_to_mm: True by default
- transform_time_to_seconds: True by default
- transform_tilt_xy_to_azimuth_and_tilt: True by default
- time_conversion: 1000 by default
- tablet_pixel_resolution: (1920, 1080) by default
- tablet_mm_dimensions: (344.2, 193.6) by default
Examples
Load sample
```python
from handwriting_sample import HandwritingSample

# load from svc
svc_sample = HandwritingSample.from_svc(path="path_to_svc")
print(svc_sample)
```
Load sample from JSON and print some time-series
```python
from handwriting_sample import HandwritingSample

# load from json
json_sample = HandwritingSample.from_json(path="path_to_json")
print(json_sample)

# print x
print(json_sample.x)

# print y
print(json_sample.y)

# print trajectory
print(json_sample.xy)

# print pressure
print(json_sample.pressure)
```
Strokes
A stroke is one segment of data between two changes of the pen position (up/down).
The return value of all the following methods is a tuple with the identification of the movement and an object of the
HandwritingSample class.
```python
from handwriting_sample import HandwritingSample

# load sample
sample = HandwritingSample.from_json(path="path_to_json")

# get all strokes
strokes = sample.get_strokes()

# get on surface strokes
stroke_on_surface = sample.get_on_surface_strokes()

# get in air strokes
strokes_in_air = sample.get_in_air_strokes()
```
Alternatively, you can get only the on-surface or in-air data:
```python
from handwriting_sample import HandwritingSample

# load sample
sample = HandwritingSample.from_json(path="path_to_json")

# get movement on surface
on_surface_data = sample.get_on_surface_data()

# get movement in air
in_air_data = sample.get_in_air_data()
```
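To illustrate what a stroke is, the following sketch segments a pen_status series into runs of equal status. The function `split_into_strokes` and the labels "on_surface" and "in_air" are hypothetical; the package methods above return HandwritingSample objects rather than index ranges.

```python
from itertools import groupby


def split_into_strokes(pen_status):
    """Return (label, start, stop) for each run of equal pen status (stop is exclusive)."""
    strokes = []
    start = 0
    for status, group in groupby(pen_status):
        length = len(list(group))
        label = "on_surface" if status == 1 else "in_air"
        strokes.append((label, start, start + length))
        start += length
    return strokes


pen_status = [1, 1, 0, 0, 0, 1]
for label, start, stop in split_into_strokes(pen_status):
    print(label, start, stop)
```

The index ranges can then be used to slice every other time-series (x, y, pressure, and so on) of the same sample.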
Unit Transformation
```python
from handwriting_sample import HandwritingSample

# load sample
sample = HandwritingSample.from_json(path="path_to_json")

# transform axis
sample.transform_axis_to_mm(conversion_type=HandwritingSample.transformer.LPI, lpi_value=5080, shift_to_zero=True)

# transform time to seconds
sample.transform_time_to_seconds()

# transform angle
sample.transform_angle_to_degree(angle=HandwritingSample.TILT)
```
Alternatively, you can transform all units at once:
```python
from handwriting_sample import HandwritingSample

# load sample
sample = HandwritingSample.from_json(path="path_to_json")

# transform all units
sample.transform_all_units()
```
Store Data
If you provide metadata, the filename is generated automatically; otherwise you need to choose a filename. You can also store only the original data.
```python
from handwriting_sample import HandwritingSample

# load sample from svc
sample = HandwritingSample.from_svc(path="path_to_svc")

# store data to json
sample.to_json(path="path_to_storage")

# store original raw data to json
sample.to_json(path="path_to_storage", store_original_data=True)
```
Transform RAW database to database with transformed units
For example, suppose you have a database of SVC files with RAW data
and you want to transform the handwriting units of all data, add some metadata,
and store the result as JSON:
```python
from handwriting_sample import HandwritingSample

# Prepare metadata
meta_data = {
    "protocol_id": "pd_protocol_2018",
    "device_type": "Wacom Cintiq",
    "device_driver": "2.1.0",
    "wintab_version": "1.2.5",
    "lpi": 1024,
    "time_series_ranges": {
        "x": [0, 1025],
        "y": [0, 1056],
        "azimuth": [0, 1000],
        "tilt": [0, 1000],
        "pressure": [0, 2048]}}

# Paths of the SVC files to process (placeholder)
file_paths = ["path_to_svc"]

# Go through each file in the file list
for file in file_paths:
    # load sample from svc
    sample = HandwritingSample.from_svc(path=file)

    # add metadata
    sample.add_meta_data(meta_data=meta_data)

    # transform all units
    sample.transform_all_units()

    # store transformed data to json
    sample.to_json(path="path_to_storage")
```
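The list of files iterated over in the loop above has to be built by the caller. One possible way, sketched here with the standard library only, is to collect every *.svc file in a directory; the helper `collect_svc_files` and the directory name are hypothetical.

```python
from pathlib import Path


def collect_svc_files(directory):
    """Return the sorted paths (as strings) of all *.svc files directly inside a directory."""
    return sorted(str(path) for path in Path(directory).glob("*.svc"))


# "path_to_database" is a placeholder for the directory holding the RAW SVC files.
file_paths = collect_svc_files("path_to_database")
print(len(file_paths))
```

Sorting keeps the processing order reproducible between runs, since the order returned by `glob` is not guaranteed.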
Data visualisation
The package also supports visualisation, e.g.:
```python
from handwriting_sample import HandwritingSample

# load sample from svc
sample = HandwritingSample.from_svc(path="path_to_svc")

# transform all units
sample.transform_all_units()

# show separate movements
sample.plot_separate_movements()

# show in air data
sample.plot_in_air()

# show all data
sample.plot_all_data()
```
License
This project is licensed under the MIT License - see the LICENSE file for details.
Contributors
This package is developed by the members of Brain Diseases Analysis Laboratory. For more information, please contact the head of the laboratory Jiri Mekyska mekyska@vut.cz or the main developer: Jan Mucha mucha@vut.cz.
Owner
- Name: Brain Diseases Analysis Laboratory
- Login: BDALab
- Kind: organization
- Email: mekyska@vut.cz
- Location: Brno, Czech Republic
- Website: https://bdalab.utko.feec.vutbr.cz/
- Twitter: BDALab_
- Repositories: 5
- Profile: https://github.com/BDALab
BDALab is an international multidisciplinary research group focusing on the objective and quantitative analysis of brain diseases.
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Mucha"
    given-names: "Jan"
    orcid: "https://orcid.org/0000-0001-5126-440X"
  - family-names: "Galaz"
    given-names: "Zoltan"
    orcid: "https://orcid.org/0000-0002-8978-351X"
  - family-names: "Zvoncak"
    given-names: "Vojtech"
    orcid: "https://orcid.org/0000-0002-1948-4653"
  - family-names: "Mekyska"
    given-names: "Jiri"
    orcid: "https://orcid.org/0000-0002-6195-193X"
title: "Handwriting sample"
version: 1.0.5
date-released: 2023-11-03
url: "https://github.com/BDALab/handwriting-sample/"
GitHub Events
Total
- Push event: 1
Last Year
- Push event: 1
Committers
Last synced: almost 3 years ago
All Time
- Total Commits: 92
- Total Committers: 5
- Avg Commits per committer: 18.4
- Development Distribution Score (DDS): 0.467
Top Committers
| Name | Email | Commits |
|---|---|---|
| DzanMucha | m****o@p****z | 49 |
| Jan Mucha | m****a@v****z | 22 |
| Zoltán Galáž | x****0@g****m | 12 |
| Jan Mucha | m****i@g****m | 6 |
| Zoltán Galáž | z****n@g****u | 3 |
Committer Domains (Top 20 + Academic)
Issues and Pull Requests
Last synced: 7 months ago
All Time
- Total issues: 1
- Total pull requests: 8
- Average time to close issues: N/A
- Average time to close pull requests: 11 months
- Total issue authors: 1
- Total pull request authors: 1
- Average comments per issue: 0.0
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 1
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
- zgalaz (9)
Packages
- Total packages: 1
- Total downloads:
  - pypi: 67 last month
- Total dependent packages: 2
- Total dependent repositories: 1
- Total versions: 6
- Total maintainers: 1
pypi.org: handwriting-sample
Handwriting sample
- Homepage: https://github.com/BDALab/handwriting-sample
- Documentation: https://handwriting-sample.readthedocs.io/
- License: MIT
- Latest release: 1.0.6 (published 7 months ago)
Dependencies
- matplotlib *
- numpy *
- pandas *