Recent Releases of https://github.com/caleb531/imessage-conversation-analyzer
https://github.com/caleb531/imessage-conversation-analyzer - v2.8.0
- Added regular expression support and case sensitivity support to the count_phrases analyzer
  - See the README for details
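As a rough illustration of what regular expression and case sensitivity options mean for phrase counting, here is a minimal sketch using only Python's standard `re` module. The `count_phrase` helper and its `use_regex` / `case_sensitive` parameters are hypothetical names chosen for this example; they are not ICA's actual API or flags (see the README for those).

```python
import re


def count_phrase(messages, phrase, use_regex=False, case_sensitive=False):
    """Count non-overlapping occurrences of a phrase across all messages.

    Hypothetical helper for illustration; not part of the ica package.
    """
    # Treat the phrase as a literal string unless regex matching is requested
    pattern_text = phrase if use_regex else re.escape(phrase)
    # Matching is case-insensitive unless case sensitivity is requested
    flags = 0 if case_sensitive else re.IGNORECASE
    pattern = re.compile(pattern_text, flags)
    return sum(len(pattern.findall(message)) for message in messages)


messages = ["I love you", "i LOVE you too", "Love it!"]
print(count_phrase(messages, "love"))                       # case-insensitive literal
print(count_phrase(messages, "love", case_sensitive=True))  # exact case only
print(count_phrase(messages, r"you?", use_regex=True))      # "yo" or "you"
print(count_phrase(messages, "you?"))                       # literal "you?"
```

Escaping the phrase in literal mode keeps characters such as `?` or `.` from being interpreted as regex syntax.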
- Python
Published by caleb531 11 months ago
https://github.com/caleb531/imessage-conversation-analyzer - v2.7.0
- Added a new --result-count option to the most_frequent_emojis analyzer
- Various cleanup to code and documentation
- Python
Published by caleb531 about 1 year ago
https://github.com/caleb531/imessage-conversation-analyzer - v2.6.0
New Features
- You can now filter any analyzer by date and participant
  - These are available in the CLI via new --from-date, --to-date, and --from-person flags
  - These are also available in the Python API via new input parameters to ica.get_dataframes(): from_date, to_date, and from_person
  - See the README for details on how to use these new filters
- Added support for iOS 18's emoji-based reactions that allow for reacting with any arbitrary emoji
  - This new support is mainly reflected in the Reactions metrics for the message_totals analyzer
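To make the date and participant filters more concrete, the sketch below shows one plausible way such filtering could behave on a plain list of messages. Everything here is an assumption for illustration: the `Message` record, the `filter_messages` helper, and the choice of an inclusive start date with an exclusive end date are not taken from ICA itself, so consult the README for the real semantics.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Message:
    """Hypothetical message record; not ICA's actual dataframe schema."""

    sender: str
    sent_at: datetime
    text: str


def filter_messages(messages, from_date=None, to_date=None, from_person=None):
    """Keep messages in [from_date, to_date) sent by from_person.

    The half-open date range is an assumption made for this sketch.
    """
    kept = []
    for message in messages:
        if from_date is not None and message.sent_at < from_date:
            continue
        if to_date is not None and message.sent_at >= to_date:
            continue
        if from_person is not None and message.sender != from_person:
            continue
        kept.append(message)
    return kept


conversation = [
    Message("Jane", datetime(2024, 1, 5, 9, 0), "Happy new year!"),
    Message("Me", datetime(2024, 1, 5, 9, 5), "You too!"),
    Message("Jane", datetime(2024, 2, 1, 12, 0), "Lunch?"),
]
january_from_jane = filter_messages(
    conversation,
    from_date=datetime(2024, 1, 1),
    to_date=datetime(2024, 2, 1),
    from_person="Jane",
)
print([m.text for m in january_from_jane])
```

Leaving any filter as `None` disables it, so the three filters can be combined freely.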
Fixes
- Fixed some incorrect logic for how YouTube, Spotify, and Apple Music links were counted within the attachment_totals analyzer
Housekeeping
- Added missing documentation for the count_phrases analyzer to the README
- Other organizational tweaks and improvements to the README
- Python
Published by caleb531 over 1 year ago
https://github.com/caleb531/imessage-conversation-analyzer - v2.5.0
New Features
- Added a new (built-in) count_phrases analyzer which allows you to count the number of case-insensitive occurrences of any arbitrary strings across all messages in a conversation (excluding reactions)
  - e.g. ica count_phrases -c 'Jane Fernbrook' 'i love you'
- Added a new prettify_index parameter to the ica.output_results function; if you specify it with a value of False, it will disable the default behavior of titleizing index values (see the new count_phrases analyzer for an example)
Deprecations
- The get_cli_args() function has been deprecated in favor of the new get_cli_parser() function
  - To migrate, replace ica.get_cli_args() with ica.get_cli_parser().parse_args() across your project files
  - The get_cli_parser() function gives you access to the underlying argparse.ArgumentParser instance, allowing you to add new CLI arguments specific to your analyzer
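The migration pattern described above can be sketched as follows. So that the example runs without ICA installed, it defines a stand-in `get_cli_parser()` that returns an `argparse.ArgumentParser`; the real `ica.get_cli_parser()` likely defines more arguments than shown. The `--min-length` flag is a hypothetical analyzer-specific argument invented for this example.

```python
import argparse


def get_cli_parser():
    """Stand-in for ica.get_cli_parser(); the real parser may differ."""
    parser = argparse.ArgumentParser(prog="ica")
    parser.add_argument("-c", "--contact-name")
    return parser


# Before (deprecated): cli_args = ica.get_cli_args()
# After: fetch the parser, add analyzer-specific arguments, then parse
cli_parser = get_cli_parser()
# Hypothetical flag that only this custom analyzer understands
cli_parser.add_argument("--min-length", type=int, default=0)

# An explicit argument list is passed here so the sketch runs anywhere;
# a real analyzer would normally call parse_args() with no arguments
cli_args = cli_parser.parse_args(["-c", "Jane Fernbrook", "--min-length", "5"])
print(cli_args.contact_name, cli_args.min_length)
```

Because the parser is a regular `argparse.ArgumentParser`, any standard `add_argument` option (types, defaults, choices) is available to custom analyzers.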
Under-the-Hood Improvements
- Upgraded all dependencies to their latest versions
- The CLI now raises an ImportError if a module spec cannot be created (this is unlikely, though)
- The __main__ entry point module is now fully tested, increasing the code coverage for the library
- Python
Published by caleb531 over 1 year ago
https://github.com/caleb531/imessage-conversation-analyzer - v2.4.0
- Upgraded dependencies to latest versions
- EDIT: the dependency upgrade actually never got merged; this will be fixed in the next release
- Python
Published by caleb531 over 1 year ago
https://github.com/caleb531/imessage-conversation-analyzer - v2.3.0
- Added a count for audio messages to the attachment_totals analyzer
- The exposed attachments dataframe has been updated to include columns for:
  - The filename of the attachment, if applicable
  - The ID of the associated message
- The messages dataframe has been updated to include a column for the ID of the message
- Python
Published by caleb531 about 2 years ago
https://github.com/caleb531/imessage-conversation-analyzer - v2.2.0
- Rewrote the most_frequent_emojis analyzer to be substantially faster and more accurate
  - The time complexity of the algorithm has been reduced from O(n^2) to O(n), resulting in significant speedups (e.g. 10s to 3s, or 4s to 2s)
  - The new algorithm also handles combined emojis correctly (e.g. 👨‍💻, which is a combination of 👨 and 💻, is now counted correctly)
- Small refactoring improvements to clean up the codebase
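One way to count emojis while scanning each message only once, and without splitting combined emojis into their parts, is a single regular expression whose alternatives are ordered longest first. The sketch below is an illustration of that idea under stated assumptions, not the algorithm ICA actually uses; the three-entry `KNOWN_EMOJIS` list is a toy stand-in for a full emoji dataset.

```python
import re
from collections import Counter

ZWJ = "\u200d"  # zero-width joiner that glues emojis into one combined glyph
MAN = "\U0001F468"
LAPTOP = "\U0001F4BB"
MAN_TECHNOLOGIST = MAN + ZWJ + LAPTOP

# Toy emoji list for illustration; a real dataset would be far larger
KNOWN_EMOJIS = [MAN, LAPTOP, MAN_TECHNOLOGIST]

# Longest alternatives come first so a combined sequence wins over its parts
EMOJI_PATTERN = re.compile(
    "|".join(re.escape(e) for e in sorted(KNOWN_EMOJIS, key=len, reverse=True))
)


def count_emojis(messages):
    """Tally every known emoji, scanning each message once."""
    counts = Counter()
    for message in messages:
        counts.update(EMOJI_PATTERN.findall(message))
    return counts


counts = count_emojis([MAN_TECHNOLOGIST + " shipped it " + LAPTOP, MAN])
for emoji, total in counts.most_common():
    print(emoji, total)
```

A naive approach that calls `str.count` once per emoji would both rescan every message for each emoji and count the man and laptop inside the combined glyph a second time.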
- Python
Published by caleb531 over 2 years ago
https://github.com/caleb531/imessage-conversation-analyzer - v2.1.0
- Fixed a bug where ICA could not infer the format from an *.md file extension when passing a Markdown file as an output path
- A FormatNotSupportedError has been added, and is now raised if the specified format is unsupported (either on the CLI via -f/--format, or when calling ica.output_results with the format parameter)
- Refactored ica.output_results tests to be much more robust
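The interplay between an explicit format and a format inferred from the output path can be sketched roughly as below. This is an assumption-laden illustration: the `resolve_format` helper, the extension mapping, the format names other than csv and excel, and the fallback to a default format for unknown extensions are all hypothetical, and the `FormatNotSupportedError` class defined here is only a stand-in for the one ICA provides.

```python
from pathlib import Path


class FormatNotSupportedError(Exception):
    """Stand-in for ICA's error of the same name; its definition may differ."""


# Hypothetical extension-to-format mapping, for illustration only
EXTENSION_FORMATS = {".csv": "csv", ".md": "markdown", ".xlsx": "excel"}
SUPPORTED_FORMATS = set(EXTENSION_FORMATS.values())


def resolve_format(output_path=None, format=None):
    """Pick an output format from an explicit name or a file extension."""
    if format is not None:
        # An explicitly requested format must be one we know how to write
        if format not in SUPPORTED_FORMATS:
            raise FormatNotSupportedError(f"unsupported format: {format}")
        return format
    if output_path is not None:
        # Otherwise infer the format from the extension, ignoring case
        extension = Path(output_path).suffix.lower()
        return EXTENSION_FORMATS.get(extension)
    return None  # None means "use the default format" in this sketch


print(resolve_format(output_path="report.md"))
print(resolve_format(output_path="report.txt", format="csv"))
try:
    resolve_format(format="pdf")
except FormatNotSupportedError as error:
    print(error)
```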
- Python
Published by caleb531 over 2 years ago
https://github.com/caleb531/imessage-conversation-analyzer - v2.0.0
ICA v2 is the next major release of the library, representing as significant a milestone as the initial v1 release! https://pypi.org/project/imessage-conversation-analyzer/
TL;DR
- In addition to the CLI, a comprehensive Python API has been added so that you can write custom programs to integrate with the library more easily
- It adds support for many more emoji
- It fixes some major bugs and makes the tool more intuitive to use
- It adds support for writing to Excel files
- It adds support for non-US phone numbers
- It adds timezone support to eliminate any potential for date/time ambiguity
Python API
Most notable is the addition of a fully-typed Python API, which allows you to write custom analyzers that integrate with ICA with greater power and flexibility.
v1 had a concept of "metric files", which were rather limited in capability because they could only be called via the CLI and did not allow for post-processing.
In v2, these "metric files" have been re-dubbed "analyzers" for better clarity, and the new Python API allows for importing of the ica package in your module.
This new API was designed to be adaptable to different kinds of needs. That is, the processing of the message data provided by the library can be as simple or as sophisticated as you'd like. For example, you can either choose to integrate with the built-in CLI, or you can write in your own processing logic.
We encourage you to look at the built-in analyzer modules as examples of how to use this new API.
Improved Emoji Support
Previously, ICA only supported a small subset of emoji for the "Most Frequent" analyzer. ICA v2 adds support for over 1,800 of the emoji supported by the Unicode standard. This should cover the majority of emojis that people use in their message conversations.
Parsing of Typedstream-Encoded Message Data
Certain messages in the macOS message database are encoded using Apple's binary typedstream format in a special attributedBody column. In ICA v1, these types of messages could not be parsed and therefore were excluded from the dataset and from certain analytics (like emoji counts).
In ICA v2, new logic has been added to decode these typedstream-encoded messages and merge them into the main dataset, thanks to help from the pytypedstream package. This means you can be confident that ICA will analyze the entirety of your message data for a conversation, not merely a subset of it.
Excel Support
The CLI and the Python API now support outputting your analyzer dataframe to Excel. This is achieved by specifying the new -o/--output flag on the CLI with a file path ending in .xlsx. You can also pass --format=xlsx if you want to capture or redirect the binary output for your own purposes.
For the Python API, you can pass the output parameter to ica.output_results() with an .xlsx file path. Alternatively, you can pass format='excel', with output as a BytesIO object.
sh
ica transcript -c 'Thomas Riverstone' -o ./my_transcript.xlsx
python
ica.output_results(my_df, output='./my_transcript.xlsx')
Timezone Support
Previously, all dates/times in ICA v1 would assume the local system timezone of the user running the CLI. In v2, this is still the default behavior, but a new -t/--timezone option (or timezone parameter for ica.get_dataframes) has been added. This new parameter accepts any IANA timezone name (e.g. America/New_York or UTC).
sh
ica message_totals -c 'John Doe' -t UTC
python
dfs = ica.get_dataframes(contact_name=my_contact_name, timezone='UTC')
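To see why an explicit timezone removes ambiguity, consider how the same instant maps to different calendar dates depending on the zone. The sketch below uses only Python's standard `zoneinfo` module (Python 3.9+, with a system timezone database available) and does not call ICA; the `sent_at` timestamp is made up for the example.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# A made-up message timestamp, stored unambiguously in UTC
sent_at = datetime(2024, 1, 27, 2, 30, tzinfo=timezone.utc)

# The same instant lands on different dates in different IANA timezones
for tz_name in ("UTC", "America/New_York"):
    local = sent_at.astimezone(ZoneInfo(tz_name))
    print(tz_name, local.date().isoformat(), local.strftime("%H:%M"))
# UTC 2024-01-27 02:30
# America/New_York 2024-01-26 21:30
```

A per-day total computed in UTC would attribute this message to January 27, while one computed in New York time would attribute it to January 26, which is the kind of discrepancy a timezone option lets you control.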
Default Format Changes
The default format (i.e. when you omit the --format/-f/format option) has changed slightly from using the tabulate package to using pandas.DataFrame.to_string. This improves the consistency of the API to allow for writing data in the default format to a buffer or file (like other formats).
Before:
```
Date Total
2024-01-26 00:00:00 12
2024-01-27 00:00:00 45
2024-01-28 00:00:00 56
```
After:
Date Total
2024-01-26 12
2024-01-27 45
2024-01-28 56
Support for Non-US Phone Numbers
ICA v2 now integrates with the phonenumbers package to standardize the parsing of phone numbers when looking up the conversation for a particular contact. A benefit of this integration is that non-US phone numbers are supported.
Dependency Upgrades and Changes
All project dependencies have been updated to their latest versions:
Upgraded (Existing) Dependencies
- pandas has been upgraded to v2.2.0
- tabulate has been upgraded to v0.9.0
New Dependencies
- openpyxl (for reading and writing Excel files)
- pyarrow (per the recommendation of pandas v2)
- phonenumbers (to standardize the parsing of contact phone numbers)
- tzlocal (for determining the local timezone of the user's system)
Full Test Suite
ICA v2 adds a full test suite, boasting 96% code coverage across the entire codebase. This includes tests for the core ica package and all built-in analyzers, for both the Python API and the CLI utility. With this, you may have greater confidence that the package will behave correctly in all the relevant cases.
CLI Changes
You may have noticed with the above examples that the Command Line API has also changed slightly. The -m parameter has been dropped in favor of specifying the analyzer name as a single positional parameter.
Before:
sh
ica -c 'John Doe' -m ica/metrics/message_totals.py -f csv
After:
sh
ica message_totals -c 'John Doe' -f csv
Bug Fixes
- Emojis with a count of zero are now excluded from the "Most Frequent Emojis" data
- Dates with no messages sent are now excluded from the "Totals by Day" analyzer
- Fixed "Days Missed" and "Days with No Reply" calculation for the "Message Totals" analyzer
- Fixed compatibility with systems running versions of sqlite3 older than v3.39.0
Beyond that, there is a wealth of other small improvements to refactor and polish up the codebase.
No changes since beta 1; the release notes are largely copied from the beta 1 release notes
- Python
Published by caleb531 over 2 years ago
https://github.com/caleb531/imessage-conversation-analyzer - v2.0.0-beta.1
ICA v2 is the next major release of the library, representing as significant a milestone as the initial v1 release!
TL;DR
- In addition to the CLI, a comprehensive Python API has been added so that you can write custom programs to integrate with the library more easily
- It adds support for many more emoji
- It fixes some major bugs and makes the tool more intuitive to use
- It adds support for writing to Excel files
- It adds support for non-US phone numbers
- It adds timezone support to eliminate any potential for date/time ambiguity
Python API
Most notable is the addition of a fully-typed Python API, which allows you to write custom analyzers that integrate with ICA with greater power and flexibility.
v1 had a concept of "metric files", which were rather limited in capability because they could only be called via the CLI and did not allow for post-processing.
In v2, these "metric files" have been re-dubbed "analyzers" for better clarity, and the new Python API allows for importing of the ica package in your module.
This new API was designed to be adaptable to different kinds of needs. That is, the processing of the message data provided by the library can be as simple or as sophisticated as you'd like. For example, you can either choose to integrate with the built-in CLI, or you can write in your own processing logic.
We encourage you to look at the built-in analyzer modules as examples of how to use this new API.
Improved Emoji Support
Previously, ICA only supported a small subset of emoji for the "Most Frequent" analyzer. ICA v2 adds support for over 1,800 of the emoji supported by the Unicode standard. This should cover the majority of emojis that people use in their message conversations.
Parsing of Typedstream-Encoded Message Data
Certain messages in the macOS message database are encoded using Apple's binary typedstream format in a special attributedBody column. In ICA v1, these types of messages could not be parsed and therefore were excluded from the dataset and from certain analytics (like emoji counts).
In ICA v2, new logic has been added to decode these typedstream-encoded messages and merge them into the main dataset, thanks to help from the pytypedstream package. This means you can be confident that ICA will analyze the entirety of your message data for a conversation, not merely a subset of it.
Excel Support
The CLI and the Python API now support outputting your analyzer dataframe to Excel. This is achieved by specifying the new -o/--output flag on the CLI with a file path ending in .xlsx. You can also pass --format=xlsx if you want to capture or redirect the binary output for your own purposes.
For the Python API, you can pass the output parameter to ica.output_results() with an .xlsx file path. Alternatively, you can pass format='excel', with output as a BytesIO object.
sh
ica transcript -c 'Thomas Riverstone' -o ./my_transcript.xlsx
python
ica.output_results(my_df, output='./my_transcript.xlsx')
Timezone Support
Previously, all dates/times in ICA v1 would assume the local system timezone of the user running the CLI. In v2, this is still the default behavior, but a new -t/--timezone option (or timezone parameter for ica.get_dataframes) has been added. This new parameter accepts any IANA timezone name (e.g. America/New_York or UTC).
sh
ica message_totals -c 'John Doe' -t UTC
python
dfs = ica.get_dataframes(contact_name=my_contact_name, timezone='UTC')
Default Format Changes
The default format (i.e. when you omit the --format/-f/format option) has changed slightly from using the tabulate package to using pandas.DataFrame.to_string. This improves the consistency of the API to allow for writing data in the default format to a buffer or file (like other formats).
Before:
```
Date Total
2024-01-26 00:00:00 12
2024-01-27 00:00:00 45
2024-01-28 00:00:00 56
```
After:
Date Total
2024-01-26 12
2024-01-27 45
2024-01-28 56
Support for Non-US Phone Numbers
ICA v2 now integrates with the phonenumbers package to standardize the parsing of phone numbers when looking up the conversation for a particular contact. A benefit of this integration is that non-US phone numbers are supported.
Dependency Upgrades and Changes
All project dependencies have been updated to their latest versions:
Upgraded (Existing) Dependencies
- pandas has been upgraded to v2.2.0
- tabulate has been upgraded to v0.9.0
New Dependencies
- openpyxl (for reading and writing Excel files)
- pyarrow (per the recommendation of pandas v2)
- phonenumbers (to standardize the parsing of contact phone numbers)
- tzlocal (for determining the local timezone of the user's system)
Full Test Suite
ICA v2 adds a full test suite, boasting 96% code coverage across the entire codebase. This includes tests for the core ica package and all built-in analyzers, for both the Python API and the CLI utility. With this, you may have greater confidence that the package will behave correctly in all the relevant cases.
CLI Changes
You may have noticed with the above examples that the Command Line API has also changed slightly. The -m parameter has been dropped in favor of specifying the analyzer name as a single positional parameter.
Before:
sh
ica -c 'John Doe' -m ica/metrics/message_totals.py -f csv
After:
sh
ica message_totals -c 'John Doe' -f csv
Bug Fixes
- Emojis with a count of zero are now excluded from the "Most Frequent Emojis" data
- Dates with no messages sent are now excluded from the "Totals by Day" analyzer
- Fixed "Days Missed" and "Days with No Reply" calculation for the "Message Totals" analyzer
- Fixed compatibility with systems running versions of sqlite3 older than v3.39.0
Beyond that, there is a wealth of other small improvements to refactor and polish up the codebase.
- Python
Published by caleb531 over 2 years ago
https://github.com/caleb531/imessage-conversation-analyzer - v1.2.3
- Fixed the CLI program failing to run due to a number of missing file errors
- Everyone is strongly encouraged to update to this version
- Python
Published by caleb531 over 2 years ago
https://github.com/caleb531/imessage-conversation-analyzer - v1.2.1
- Fixed a critical bug affecting the v1.2.0 distributions where the emojis data was missing, thus causing the most_frequent_emojis and least_frequent_emojis metrics to raise an exception.
- Python
Published by caleb531 over 2 years ago
https://github.com/caleb531/imessage-conversation-analyzer - v1.2.0
New Features
- Include Spotify links in attachment totals metric data
Bug Fixes
- Fixed several bugs where the package would be unable to find a contact under any of the following conditions:
  - The last name was not on the contact record
  - A phone number was present, but an email address was missing
  - An email address was present, but a phone number was missing
  - There was leading or trailing whitespace in the contact name given on the command line
Upgrades
- Upgraded pandas from v1.1.2 to v1.3.2
- Upgraded tabulate from v0.8.7 to v0.8.9
- Python
Published by caleb531 over 2 years ago
https://github.com/caleb531/imessage-conversation-analyzer - v1.1.0
- Added a new conversation_export metric file; this is designed to allow easy exporting of an entire iMessage thread (to a format like CSV, for example)
- The case of the contact name is now ignored; --contact-name 'john doe' and --contact-name 'John Doe' are now equivalent
- Fixed an AttributeError: 'NoneType' object has no attribute 'replace' error
- Python
Published by caleb531 about 5 years ago
https://github.com/caleb531/imessage-conversation-analyzer - v1.0.1
- Fixed a bug where --format csv / -f csv would not return the data in CSV format
- Python
Published by caleb531 about 5 years ago
https://github.com/caleb531/imessage-conversation-analyzer - v1.0.0
- Initial stable API
- Python
Published by caleb531 over 5 years ago