Recent Releases of https://github.com/alibaba/alink

https://github.com/alibaba/alink - Alink version 1.6.2

Release version 1.6.2

- Java
Published by xuyang1706 over 2 years ago

https://github.com/alibaba/alink - Alink version 1.6.1

Optimized performance and fixed bugs.

- Java
Published by xuyang1706 almost 3 years ago

https://github.com/alibaba/alink - Alink version 1.6.0

Optimized performance and fixed bugs.

- Java
Published by xuyang1706 over 3 years ago

https://github.com/alibaba/alink - Alink version 1.5.8

  1. Add more rules about exception.
  2. Add more graph algorithms.

- Java
Published by xuyang1706 over 3 years ago

https://github.com/alibaba/alink - Alink version 1.5.7

  1. Add pipeline model support in model stream.
  2. Refine online learning.
  3. Fixed some bugs.

- Java
Published by xuyang1706 over 3 years ago

https://github.com/alibaba/alink - Alink version 1.5.6

  1. Add partition in ak, csv and parquet source/sink.
  2. Add model stream as initial model in online learning
  3. Add lazyVizDive and lazyVizStatistics
  4. Add hbase connector
  5. Add custom path in resource plugin

- Java
Published by xuyang1706 over 3 years ago

https://github.com/alibaba/alink - Alink version 1.5.5

  1. Add datahub catalog.
  2. Add Kernel Density Estimation
  3. Fixed some bugs

- Java
Published by xuyang1706 almost 4 years ago

https://github.com/alibaba/alink - Alink version 1.5.4

  1. Add parquet source
  2. Add Export2FileSinkStreamOp https://www.yuque.com/pinshu/alinktutorial/bookjava32_5
  3. Add Model File Path on predictop and pipeline.
  4. Add TFRecordDataset source/sink
  5. Add ONNX predict operators

- Java
Published by xuyang1706 almost 4 years ago

https://github.com/alibaba/alink - Alink version 1.5.3

  1. Add xgboost wrapper using plugin
  2. Add PyTorch model predictor, close #207.
  3. Support pyalink on vvp. https://www.yuque.com/pinshu/alinktutorial/flinkvvp
  4. Support pyalink on dsw. https://www.yuque.com/pinshu/alinktutorial/paidsw
  5. Support pyalink on pai-designer. https://www.yuque.com/pinshu/alinktutorial/paidesigner
  6. Add serializer of MTable and Tensor types
  7. Add code of pyalink, close #208.
  8. Fix #199, #193.

- Java
Published by xuyang1706 almost 4 years ago

https://github.com/alibaba/alink - Alink version 1.5.2

  1. Improve the performance of model stream.
  2. Add ToTensor, ToVector, ToMTable and support tensor, vector, mtable types in csv source.
  3. Keras sequential operators now support string/int types; Improve plugin mechanism in TF predictor.
  4. Add redis to plugin and add lookup redis operator.
  5. PyAlink: StreamOperator print function now supports specifying port.

- Java
Published by xuyang1706 about 4 years ago

https://github.com/alibaba/alink - Alink version 1.5.1

  1. Improve the performance of dl module.
  2. Resolve many issues on Windows platform.
  3. Add incremental training mode for LR, Softmax etc.
  4. Improve the performance of graph-based random walk algorithms.

- Java
Published by xuyang1706 about 4 years ago

https://github.com/alibaba/alink - Alink version 1.5.0

  1. Add timeseries
    • Add prophet model. #176
    • Add AutoArima, Arima, HoltWinters, AutoGarch
    • Add LSTNet, DeepAR
  2. Add deep learning module (Linux and MacOSX Intel chips)
  3. Add MTable, Tensor
  4. Add resource plugin
  5. Improve usage of PyFlink in PyAlink, close #178

- Java
Published by xuyang1706 over 4 years ago

https://github.com/alibaba/alink - Alink version 1.4.0

  1. Adapt to Flink 1.13.
  2. Fixed some bugs
  3. Add some feature engineering methods
  4. Refine the documents of BatchOp/StreamOp
  5. Add Java demos

- Java
Published by xuyang1706 over 4 years ago

https://github.com/alibaba/alink - Alink version 1.3.2

Release note:

  1. SLF4J load error when run Java example #109
  2. one hot encode a little optimization #112
  3. Add quoting in mysql column name #159
  4. fix some error (decimal exception & partition set invalid) #162
  5. Fix partition overwrite in hive.
  6. Upgrade the flink version from 1.12.0 to 1.12.1

- Java
Published by xuyang1706 almost 5 years ago

https://github.com/alibaba/alink - Alink version 1.3.1

  1. Adapt to Flink 1.12.
  2. Add plugin of kafka.
  3. Add s3 file system.
  4. Add odps catalog.
  5. Fix poisson and add glm model info.
  6. Add multi-files in pipeline loader and local predictor loader.
  7. Use the legacy serializer to stay compatible with the old ak format.
  8. Change the vector type to CompositeType and the sparse vector to a POJO type.
  9. Remove the REGEXP_REPLACE in sql selector for flink 1.12

- Java
Published by xuyang1706 about 5 years ago

https://github.com/alibaba/alink - Alink version 1.3.0

  1. Add more model info batch op and support print model info in pipeline model.
  2. Add recommendation module.
    • Supported recommenders are:
      • Als
      • Factorization Machines
      • ItemCF
      • UserCF
    • Other supported functions for the recommendation module are:
      • Leave k-object out
      • Leave top k-object out
      • Ranking evaluation
      • Multi-Label evaluation
  3. Add online learning algorithms.
    • ftrl model filter
  4. Add a series of similarity algorithms.
    • VectorNearestNeighbor
    • TextSimilarity
    • TextNearestNeighbor
    • TextApproxNearestNeighbor
    • StringSimilarity
    • StringNearestNeighbor
    • StringApproxNearestNeighbor
  5. Add DocWordCountBatchOp, KeywordsExtractionBatchOp, TfidfBatchOp, WordCountBatchOp
  6. Add KNN
  7. Add GeoKMeans, Streaming Kmeans
  8. Add model selector algorithms.
    • RandomSearchCV
    • RandomSearchTVSplit
  9. Add plugin in filesystem and catalog. Add catalogs of hive, mysql, derby and sqlite
  10. PyAlink:
    • Align with new functionalities in Java side, including new operators, catalog, plugin mechanism, and so on;
    • For Flink version 1.9, PyAlink now depends on PyFlink directly, which enables support for flink run and table-related operations.
  11. Fix some issues, optimize performance and add more parameters in linear and tree model
  12. Add test utils module and optimize performance of unit tests.
  13. Remove the db module.
  14. Refine the save/load in pipeline and pipeline model. Use Ak as the default format for save/load.
  15. Support loading LocalPredictor from an Ak file saved on a filesystem. This avoids a collect when loading the LocalPredictor. See #78 #79
  16. Add multi-threads in all mapper
  17. Optimize memory usage of batch prediction.
  18. Add pseudoInverse in matrix
  19. Support sparse vectors that do not declare a size
  20. Fix sequencing issue when linkFrom the model info batch op
  21. Optimize the format of lazy print.
  22. Add Stopwatch and TimeSpan
  23. Add serialVersionUID in all serializable classes.
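
Item 19 above concerns sparse vectors that carry no declared size. As a minimal sketch of that idea, and not Alink's implementation, the parser below accepts both a sized text form such as `$4$1:2.0 3:4.0` and a size-less form such as `1:2.0 3:4.0`. The text format, the class name `SparseVectorText`, and the use of -1 for an undeclared size are all assumptions made for illustration.

```java
import java.util.TreeMap;

/** Hypothetical helper for illustration only; not a class from Alink. */
final class SparseVectorText {
    /** Declared dimension, or -1 when the text carried no size header. */
    final int size;
    /** Non-zero entries keyed by index, kept in ascending index order. */
    final TreeMap<Integer, Double> values;

    private SparseVectorText(int size, TreeMap<Integer, Double> values) {
        this.size = size;
        this.values = values;
    }

    /** Parses "$size$i:v i:v ..." or the size-less form "i:v i:v ...". */
    static SparseVectorText parse(String text) {
        String body = text.trim();
        int size = -1;
        if (body.startsWith("$")) {
            int end = body.indexOf('$', 1);
            if (end < 0) {
                throw new IllegalArgumentException("Unterminated size header: " + text);
            }
            size = Integer.parseInt(body.substring(1, end).trim());
            body = body.substring(end + 1).trim();
        }
        TreeMap<Integer, Double> values = new TreeMap<>();
        if (!body.isEmpty()) {
            for (String entry : body.split("\\s+")) {
                String[] pair = entry.split(":", 2);
                if (pair.length != 2) {
                    throw new IllegalArgumentException("Expected index:value but got: " + entry);
                }
                int index = Integer.parseInt(pair[0]);
                if (index < 0 || (size >= 0 && index >= size)) {
                    throw new IllegalArgumentException("Index out of range: " + index);
                }
                values.put(index, Double.parseDouble(pair[1]));
            }
        }
        return new SparseVectorText(size, values);
    }

    /** Declared size if present, otherwise the smallest size that fits every stored index. */
    int effectiveSize() {
        if (size >= 0) {
            return size;
        }
        return values.isEmpty() ? 0 : values.lastKey() + 1;
    }

    public static void main(String[] args) {
        SparseVectorText sized = parse("$4$1:2.0 3:4.0");
        SparseVectorText unsized = parse("1:2.0 3:4.0");
        System.out.println(sized.size + " " + sized.values);
        System.out.println(unsized.size + " " + unsized.effectiveSize());
    }
}
```

Deriving an effective size from the largest index is only one possible policy. A consumer that knows the real dimension, for example from a model, would normally supply it instead.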

- Java
Published by xuyang1706 about 5 years ago

https://github.com/alibaba/alink - Alink version 1.2.0

  1. Adapt for Flink 1.11
    • Flink API calls (#129), Hive connectors (#130) and the Kafka connector (#129) are adapted for Flink 1.11.
    • Adjust FilePath of FileSystem for Flink 1.11 #131
  2. Add Factorization Machines classification and regression #115
  3. Support Lazy APIs for higher user interactivity and richer information. Lazy APIs enable intermediate outputs of the ML pipeline to be printed, collected, and post-processed alongside the main data-processing flow. Such intermediate outputs include ML model and training information, evaluation metrics, data statistics, etc.
    • PyAlink supported
    • Support Lazy APIs for BatchOperators and related methods in EstimatorBase/TransformerBase #116
    • Add model information:
      • Linear model #118 #132
      • Tree model #125
      • PCA #117
      • ChisqSelector #117
      • VectorChisqSelector #117
      • KMeans #120
      • BisectingKMeans #120
      • NaiveBayes #122
      • Lda #122
      • GaussianMixture #120
      • OneHotEncoder #120
      • QuantileDiscretizer #120
      • MinMaxScaler #122
      • VectorMinMaxScaler #122
      • MaxAbsScaler #122
      • VectorMaxAbsScaler #122
      • StandardScaler #122
      • VectorStandardScaler #122
    • Add training information:
      • word2vec #125
    • Add statistics:
      • Correlation #117
      • Summary #117
    • Add EvaluationMetrics #124
  4. Add FileSystem APIs. #126 Using FileSystem APIs, users can process files on different file systems with a unified and friendly experience. Supported operations include exists, isDir, list, read, write and other common file functions. Supported file systems are:
    • HDFS
    • OSS
    • Local
  5. Ak source/sink and Csv source/sink now support the new FileSystem APIs. #126 Ak is a file format that stores data together with its schema and can be written to a filesystem. It offers the advantages of a compressed, tabular data representation. The supported APIs are shown in the table below:

    | | HDFS | OSS | Local |
    | :---: | :---: | :---: | :---: |
    | Ak source | ✔️ | ✔️ | ✔️ |
    | Ak sink | ✔️ | ✔️ | ✔️ |
    | Csv source | ✔️ | ✔️ | ✔️ |
    | Csv sink | ✔️ | ✔️ | ✔️ |

  6. Support EqualWidthDiscretizer. #123

  7. Feature Enhancements and API unification in Clustering. #121

  8. Refine code of QuantileDiscretizer and OneHotEncoder #111

  9. Fix predict stream op in alspredictstreamop.md #104
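
The Lazy APIs in item 3 are described as letting intermediate outputs be printed or collected as a side effect of the main job. A minimal sketch of that pattern in plain Java follows: callbacks are registered first and only fire once the computation actually runs. The class `LazyStage` and its method signatures are hypothetical and only echo the wording of the release note; they are not Alink's operator classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical illustration of deferred callbacks; not Alink's operator classes. */
final class LazyStage<T> {
    private final Supplier<T> computation;
    private final List<Consumer<T>> callbacks = new ArrayList<>();

    LazyStage(Supplier<T> computation) {
        this.computation = computation;
    }

    /** Registers a callback; nothing is computed yet. */
    LazyStage<T> lazyCollect(Consumer<T> callback) {
        callbacks.add(callback);
        return this;
    }

    /** Convenience callback that prints the result under a title once it exists. */
    LazyStage<T> lazyPrint(String title) {
        return lazyCollect(value -> System.out.println(title + ": " + value));
    }

    /** Runs the computation once and hands the result to every registered callback. */
    T execute() {
        T result = computation.get();
        for (Consumer<T> callback : callbacks) {
            callback.accept(result);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Double> collected = new ArrayList<>();
        LazyStage<Double> mean = new LazyStage<>(() -> (1.0 + 2.0 + 3.0) / 3.0);
        mean.lazyPrint("mean").lazyCollect(collected::add);
        System.out.println("registered, not yet run: " + collected.isEmpty());
        mean.execute();
        System.out.println("collected after execute: " + collected);
    }
}
```

The point of the design is that registering a callback costs nothing up front, so several inspections can share a single execution of the underlying job.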

- Java
Published by xuyang1706 over 5 years ago

https://github.com/alibaba/alink - Alink version 1.1.2

  1. Add transformers among formats Vector, CSV, Json, KV, Columns and Triple #93
    • Support AnyToAny transformation
    • Unified transformation params and easy use.
  2. Support SQL select statements in the Pipeline and LocalPredictor #61
    • Support flink planner built-in functions regarding individual rows: comparison, logical, arithmetic, string, temporal, conditional, type conversion, hash, etc.
    • Add alinkshaded/shadedprotobuf_java to support usage of native Calcite.
  3. Support Hive source and sink #96
    • Support Batch/Stream source & sink of Hive.
    • Support partition of table.
    • Simplify the dependence of Hive jar.
    • Support multiple versions: 2.0, 2.1, 2.2, 2.3, 3.0
  4. Fix PyAlink starting and UDF issues on Windows #76, #77
  5. Support BigInteger type in MySql source #86
  6. Add open and close in mapper. #92
  7. Add open function in SegmentMapper and StopwordsRemoverMapper #94
  8. Unify HandleInvalid Params #95
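
Item 1 above adds conversions among formats such as KV, CSV and Columns. The sketch below shows, under stated assumptions, what one such conversion involves: a KV string is parsed into named values and then laid out as a CSV line in a chosen column order. The class `KvFormatSketch`, its methods, and the delimiter handling are hypothetical and are not Alink's operators or parameters.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical KV-to-CSV conversion for illustration; not Alink's format operators. */
final class KvFormatSketch {
    /** Parses text such as "a:1,b:2" into an insertion-ordered map. */
    static Map<String, String> parseKv(String text, String colDelimiter, String valDelimiter) {
        Map<String, String> result = new LinkedHashMap<>();
        if (text == null || text.trim().isEmpty()) {
            return result;
        }
        // Pattern.quote keeps delimiters like "|" from being read as regex syntax.
        for (String pair : text.split(Pattern.quote(colDelimiter))) {
            int pos = pair.indexOf(valDelimiter);
            if (pos < 0) {
                throw new IllegalArgumentException("Missing value delimiter in: " + pair);
            }
            String key = pair.substring(0, pos).trim();
            String value = pair.substring(pos + valDelimiter.length()).trim();
            result.put(key, value);
        }
        return result;
    }

    /** Emits one CSV line in the given column order; absent keys become empty cells. */
    static String toCsv(Map<String, String> kv, List<String> columns) {
        List<String> cells = new ArrayList<>();
        for (String column : columns) {
            String value = kv.get(column);
            // No quoting is applied, so values must not contain commas in this sketch.
            cells.add(value == null ? "" : value);
        }
        return String.join(",", cells);
    }

    public static void main(String[] args) {
        Map<String, String> kv = parseKv("f1:1.5, f3:7", ",", ":");
        System.out.println(toCsv(kv, List.of("f1", "f2", "f3")));
    }
}
```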

- Java
Published by xuyang1706 over 5 years ago

https://github.com/alibaba/alink - Alink version 1.1.1

Enhancements & New Features

  1. Optimize conversion between operator and dataframe
  2. Auto-detect localIp when useRemoteEnv
  3. Add enum type parameter #65
    • Adapt enum type params in quantile, distance and decision tree. #67
    • linear model train params change to enum #71
    • Kafka, StringIndexer and Join add enum parameters #72
    • Adapt enum type params in pca, chi square test, glm and correlation. #73
  4. streamop window group by #68
  5. Add operators to parse strings in CSV, JSON and KV formats to columns #70
  6. Tokenizer supports string split with multiple spaces #69
  7. Make error message clear when selected columns are not found #66
  8. Add an FTRL example #64

Fix & Refinements

  9. Fix dill version conflict
  10. ALSExample error #33
  11. Bug of HasVectorSize alias #56
  12. mysqlsource error when i use collect method #45
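
Item 6 above mentions tokenizing strings that contain runs of several spaces. The difference is easy to show with the standard `String.split`: splitting on a single space produces empty tokens between consecutive spaces, while splitting on the regex `\s+` does not. The class below is a hypothetical sketch of that behaviour and is not Alink's Tokenizer.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical whitespace tokenizer for illustration; not Alink's Tokenizer. */
final class WhitespaceTokenizerSketch {
    /** Naive variant: every single space is a separator, so runs of spaces yield empty tokens. */
    static List<String> tokenizeSingleSpace(String text) {
        return Arrays.asList(text.split(" "));
    }

    /** Treats any run of whitespace as one separator and ignores leading/trailing whitespace. */
    static List<String> tokenize(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    public static void main(String[] args) {
        String text = "that  is   an example";
        System.out.println(tokenizeSingleSpace(text)); // contains empty strings
        System.out.println(tokenize(text));            // [that, is, an, example]
    }
}
```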

- Java
Published by xuyang1706 almost 6 years ago

https://github.com/alibaba/alink - Alink version 1.1.0

Enhancements & New Features

  • Improve UDF/UDTF operators so that Java and PyAlink have consistent usage and behavior. #32 #44.
  • Publish to maven central and PyPI.
  • Support Flink 1.10 and Flink 1.9. #46
    • https://github.com/alibaba/Alink/releases/tag/v1.1.0-flink-1.10
    • https://github.com/alibaba/Alink/releases/tag/v1.1.0-flink-1.9
  • Support more Kafka connectors. #41.

API change

  • Modify Naive Bayes algorithm as a text classifier. #47
  • Modify and enhance the parameter, model in QuantileDiscretizer, OneHotEncoder and Bucketizer. #48

Documentation

  • Update data links in docs and codes. #28
  • Update PyAlink install instructions. #8

Fix & Refinements

  • Fix the problem in LDA online method and refine comments in FeatureLabelUtil. #29
  • Fix the bug that initial data of KMeansAssignCluster is not cleared. #31
  • Fix the int overflow bug when reading large CSV files, and add test cases for CsvFileInputSplit. See #27
  • Cleanup some code. #15
  • Remove a redundant test case whose data source is inaccessible. See #28
  • Fix the NPE in PCA. See #42
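
One of the fixes above addresses an int overflow when reading large CSV files. The release note does not show the faulty code, so the following is only a generic illustration of that class of bug: a byte offset computed as a product of two `int` values wraps around once it passes 2 GiB, while widening one operand to `long` first gives the correct result. The class, method names, and split size are hypothetical.

```java
/** Hypothetical illustration of an int overflow in offset arithmetic; not Alink's code. */
final class SplitOffsetSketch {
    /** Buggy variant: the product is computed in 32-bit int and only then widened. */
    static long offsetWithIntMath(int splitIndex, int splitLength) {
        return splitIndex * splitLength;
    }

    /** Fixed variant: widen one operand first so the product is computed in 64-bit long. */
    static long offsetWithLongMath(int splitIndex, int splitLength) {
        return (long) splitIndex * splitLength;
    }

    public static void main(String[] args) {
        int splitLength = 64 * 1024 * 1024; // 64 MiB per split (assumed value)
        int splitIndex = 40;                // this split starts past the 2 GiB mark
        System.out.println(offsetWithIntMath(splitIndex, splitLength));  // -1610612736
        System.out.println(offsetWithLongMath(splitIndex, splitLength)); // 2684354560
    }
}
```

Where a wrapped value must never go unnoticed, `Math.multiplyExact` is an alternative that throws `ArithmeticException` on overflow instead of returning a wrong number.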

PyPI support

  • Support PyAlink installation using pip install pyalink

Maven Dependencies

Alink is now synchronized to the Maven central repository, which you can easily add to Maven projects.

With Flink-1.10

```xml
<dependency>
    <groupId>com.alibaba.alink</groupId>
    <artifactId>alink_core_flink-1.10_2.11</artifactId>
    <version>1.1.0</version>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-streaming-scala_2.11</artifactId>
    <version>1.10.0</version>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-table-planner_2.11</artifactId>
    <version>1.10.0</version>
</dependency>
```

With Flink-1.9

```xml
<dependency>
    <groupId>com.alibaba.alink</groupId>
    <artifactId>alink_core_flink-1.9_2.11</artifactId>
    <version>1.1.0</version>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-streaming-scala_2.11</artifactId>
    <version>1.9.0</version>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-table-planner_2.11</artifactId>
    <version>1.9.0</version>
</dependency>
```

- Java
Published by xuyang1706 almost 6 years ago

https://github.com/alibaba/alink - Alink version 1.0.1

[fix] Remove the dependency of net.sf.json-lib:json-lib. See #7

[fix] Remove useless operators

[add] PyAlink: Support OneVsRest

[fix] PyAlink: Fix classpath issue in Windows usage.

[update] Update the document to release 1.0.1

- Java
Published by xuyang1706 about 6 years ago

https://github.com/alibaba/alink - Alink version 1.0

Alink is the Machine Learning algorithm platform based on Flink, developed by the PAI team of Alibaba computing platform.

- Java
Published by xuyang1706 about 6 years ago