Recent Releases of https://github.com/caleb531/imessage-conversation-analyzer

https://github.com/caleb531/imessage-conversation-analyzer - v2.8.0

  • Added regular expression support and case sensitivity support to the count_phrases analyzer
    • See the README for details
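
As a rough illustration of what regular expression and case sensitivity options mean for phrase counting, here is a minimal standard-library sketch. The count_phrase helper and its use_regex and case_sensitive parameters are hypothetical names made up for this example; they are not ICA's actual implementation or flags, so refer to the README for the real options.

```python
import re


def count_phrase(messages, phrase, use_regex=False, case_sensitive=False):
    """Count occurrences of a phrase (or regex pattern) across messages.

    Hypothetical helper for illustration only; not part of the ica package.
    """
    # Treat the phrase literally unless regex matching is requested
    pattern = phrase if use_regex else re.escape(phrase)
    flags = 0 if case_sensitive else re.IGNORECASE
    compiled = re.compile(pattern, flags)
    # Sum the non-overlapping matches found in each message
    return sum(len(compiled.findall(message)) for message in messages)


messages = ["I love you", "Love you too!", "i LOVE this (a lot)"]
print(count_phrase(messages, "love"))
print(count_phrase(messages, "love", case_sensitive=True))
print(count_phrase(messages, r"love (you|this)", use_regex=True))
```

Escaping the phrase by default keeps characters such as parentheses literal, so pattern syntax only applies when regex matching is explicitly requested.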

- Python
Published by caleb531 11 months ago

https://github.com/caleb531/imessage-conversation-analyzer - v2.7.0

  • Added a new --result-count option to the most_frequent_emojis analyzer
  • Various cleanup to code and documentation

- Python
Published by caleb531 about 1 year ago

https://github.com/caleb531/imessage-conversation-analyzer - v2.6.0

New Features

  • You can now filter any analyzer by date and participant
    • These are available in the CLI via new --from-date, --to-date, and --from-person flags
    • These are also available in the Python API via new input parameters to ica.get_dataframes(): from_date, to_date, and from_person
    • See the README for details on how to use these new filters
  • Added support for iOS 18's emoji-based reactions that allow for reacting with any arbitrary emoji
    • This new support is mainly reflected in the Reactions metrics for the message_totals analyzer
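
The date and participant filters described above can be pictured with a small standard-library sketch. The filter_messages helper and the message dictionaries are hypothetical stand-ins, not ica.get_dataframes() itself, and treating both date bounds as inclusive is an assumption of this example; the README describes the library's actual behavior.

```python
from datetime import date


def filter_messages(messages, from_date=None, to_date=None, from_person=None):
    """Keep only the messages that match every filter that was given.

    Hypothetical helper; both date bounds are assumed to be inclusive here.
    """
    filtered = []
    for message in messages:
        sent_on = message["sent_on"]
        if from_date is not None and sent_on < from_date:
            continue
        if to_date is not None and sent_on > to_date:
            continue
        if from_person is not None and message["sender"] != from_person:
            continue
        filtered.append(message)
    return filtered


messages = [
    {"sender": "Jane", "sent_on": date(2024, 1, 26), "text": "Hi"},
    {"sender": "Me", "sent_on": date(2024, 1, 27), "text": "Hello"},
    {"sender": "Jane", "sent_on": date(2024, 1, 28), "text": "How are you?"},
]
recent_from_jane = filter_messages(
    messages, from_date=date(2024, 1, 27), from_person="Jane"
)
print([message["text"] for message in recent_from_jane])
```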

Fixes

  • Fixed some incorrect logic for how YouTube, Spotify, and Apple Music links were counted within the attachment_totals analyzer

Housekeeping

  • Added missing documentation for the count_phrases analyzer to the README
  • Other organizational tweaks and improvements to the README

- Python
Published by caleb531 over 1 year ago

https://github.com/caleb531/imessage-conversation-analyzer - v2.5.0

New Features

  • Added a new (built-in) count_phrases analyzer which allows you to count the number of case-insensitive occurrences of any arbitrary strings across all messages in a conversation (excluding reactions)
    • e.g. ica count_phrases -c 'Jane Fernbrook' 'i love you'
  • Added a new prettify_index parameter to the ica.output_results function; if you specify it with a value of False, it will disable the default behavior of titleizing index values (see the new count_phrases analyzer for an example)

Deprecations

  • The get_cli_args() function has been deprecated in favor of the new get_cli_parser() function
    • To migrate, replace ica.get_cli_args() with ica.get_cli_parser().parse_args() across your project files
    • The get_cli_parser() function gives you access to the underlying argparse.ArgumentParser instance, allowing you to add new CLI arguments specific to your analyzer
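
To show what extending the parser might look like, here is a sketch that uses a local stand-in for get_cli_parser() so that it runs without the ica package installed. The stand-in's contents and the --min-length argument are assumptions made up for this example; the real parser defines its own set of options.

```python
import argparse


def get_cli_parser():
    """Local stand-in for ica.get_cli_parser(), for illustration only."""
    parser = argparse.ArgumentParser(prog="ica")
    parser.add_argument("-c", "--contact-name")
    return parser


# Extend the shared parser with an analyzer-specific argument
# (--min-length is a hypothetical option invented for this sketch)
parser = get_cli_parser()
parser.add_argument("--min-length", type=int, default=0)

# An explicit argument list is passed so the example runs anywhere;
# a real analyzer would call parser.parse_args() with no arguments
cli_args = parser.parse_args(["-c", "John Doe", "--min-length", "5"])
print(cli_args.contact_name, cli_args.min_length)
```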

Under-the-Hood Improvements

  • Upgraded all dependencies to their latest versions
  • The CLI now throws an ImportError if a module spec cannot be created (this is unlikely, though)
  • The __main__ entry point module is now fully tested, increasing the code coverage for the library
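
For context on the ImportError case, the standard library's importlib.util.spec_from_file_location() returns None when it cannot find a loader for a path. The sketch below shows that general pattern; load_module_from_path is a hypothetical name, and whether ICA loads analyzer modules exactly this way is an assumption.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_module_from_path(path):
    """Import a Python module from a file path (hypothetical helper)."""
    path = str(path)
    spec = importlib.util.spec_from_file_location(Path(path).stem, path)
    # spec_from_file_location returns None when no loader can handle the path
    if spec is None or spec.loader is None:
        raise ImportError(f"Could not create a module spec for {path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as tmp_dir:
    analyzer_path = Path(tmp_dir) / "my_analyzer.py"
    analyzer_path.write_text("GREETING = 'hello'\n")
    module = load_module_from_path(analyzer_path)
    print(module.GREETING)
```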

- Python
Published by caleb531 over 1 year ago

https://github.com/caleb531/imessage-conversation-analyzer - v2.4.0

  • Upgraded dependencies to latest versions
    • EDIT: the dependency upgrade actually never got merged; this will be fixed in the next release

- Python
Published by caleb531 over 1 year ago

https://github.com/caleb531/imessage-conversation-analyzer - v2.3.0

  • Added a count for audio messages to the attachment_totals analyzer
  • The exposed attachments dataframe has been updated to include columns for:
    • The filename of the attachment, if applicable
    • The ID of the associated message
  • The messages dataframe has been updated to include a column for the ID of the message

- Python
Published by caleb531 about 2 years ago

https://github.com/caleb531/imessage-conversation-analyzer - v2.2.0

  • Rewrote the most_frequent_emojis analyzer to be substantially faster and more accurate
    • The time complexity of the algorithm has been reduced from O(n^2) to O(n), resulting in significant speedups (e.g. 10s to 3s, or 4s to 2s)
    • The new algorithm also handles combined emojis correctly (e.g. 👨‍💻, which is a combination of 👨 and 💻, is now counted correctly)
  • Small refactoring improvements to clean up the codebase
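
One way to count emojis in a single left-to-right pass while keeping combined emojis intact is to match against a pattern that tries longer sequences first. The sketch below is an assumption about the general technique, not the analyzer's actual code, and KNOWN_EMOJIS is a tiny hypothetical sample rather than the library's emoji list.

```python
import re
from collections import Counter

MAN = "\U0001F468"
LAPTOP = "\U0001F4BB"
ZERO_WIDTH_JOINER = "\u200D"
# The "man technologist" emoji is MAN + zero-width joiner + LAPTOP
MAN_TECHNOLOGIST = MAN + ZERO_WIDTH_JOINER + LAPTOP

# Hypothetical sample set; a real list would be far larger
KNOWN_EMOJIS = [MAN, LAPTOP, MAN_TECHNOLOGIST]

# Sort longest first so a combined sequence wins over its components
EMOJI_PATTERN = re.compile(
    "|".join(re.escape(emoji) for emoji in sorted(KNOWN_EMOJIS, key=len, reverse=True))
)


def count_emojis(text):
    """Count each known emoji in the text with one scan of the string."""
    return Counter(EMOJI_PATTERN.findall(text))


counts = count_emojis(MAN_TECHNOLOGIST + " hi " + MAN + " " + LAPTOP + LAPTOP)
print(counts[MAN_TECHNOLOGIST], counts[MAN], counts[LAPTOP])
```

Because the combined sequence is consumed as one match, its component emojis are not counted a second time.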

- Python
Published by caleb531 over 2 years ago

https://github.com/caleb531/imessage-conversation-analyzer - v2.1.0

  • Fixed a bug where ICA could not infer the format from an *.md file extension when passing a Markdown file as an output path
  • A FormatNotSupportedError has been added, and is now raised if the specified format is unsupported (either on the CLI via -f/--format, or when calling ica.output_results with the format parameter)
  • Refactored ica.output_results tests to be much more robust
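
The extension-based inference and the unsupported-format error can be sketched as follows. The FormatNotSupportedError class here is a local stand-in with the same name, and the infer_format helper, the EXTENSION_FORMATS mapping, and the "markdown" format name are assumptions for illustration rather than the library's actual internals.

```python
from pathlib import Path


class FormatNotSupportedError(Exception):
    """Local stand-in for the library error of the same name."""


# Hypothetical mapping of file extensions to format names
EXTENSION_FORMATS = {".csv": "csv", ".md": "markdown", ".xlsx": "excel"}


def infer_format(output_path):
    """Infer the output format from a file path's extension."""
    extension = Path(output_path).suffix.lower()
    if extension not in EXTENSION_FORMATS:
        raise FormatNotSupportedError(
            f"Unsupported output format for path: {output_path}"
        )
    return EXTENSION_FORMATS[extension]


print(infer_format("./report.md"))
```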

- Python
Published by caleb531 over 2 years ago

https://github.com/caleb531/imessage-conversation-analyzer - v2.0.0

ICA v2 is the next major release of the library, and it represents as significant a milestone as the initial v1 release! https://pypi.org/project/imessage-conversation-analyzer/

TL;DR

  1. In addition to the CLI, a comprehensive Python API has been added so that you can write custom programs to integrate with the library more easily
  2. It adds support for many more emoji
  3. It fixes some major bugs and makes the tool more intuitive to use
  4. It adds support for writing to Excel files
  5. It adds support for non-US phone numbers
  6. It adds timezone support to eliminate any potential for date/time ambiguity

Python API

Most notable is the addition of a fully typed Python API, which allows you to write custom analyzers that integrate with ICA with greater power and flexibility.

v1 had a concept of "metric files", which were rather limited in capability because they could only be called via the CLI and did not allow for post-processing.

In v2, these "metric files" have been re-dubbed "analyzers" for better clarity, and the new Python API lets you import the ica package in your own modules.

This new API was designed to be adaptable to different kinds of needs. That is, the processing of the message data provided by the library can be as simple or as sophisticated as you'd like. For example, you can either integrate with the built-in CLI or write your own processing logic.

We encourage you to look at the built-in analyzer modules as examples of how to use this new API.

Improved Emoji Support

Previously, ICA only supported a small subset of emoji for the "Most Frequent" analyzer. ICA v2 adds support for over 1,800 of the emoji supported by the Unicode standard. This should cover the majority of emojis that people use in their message conversations.

Parsing of Typedstream-Encoded Message Data

Certain messages in the macOS message database are encoded using Apple's binary typedstream format in a special attributedBody column. In ICA v1, these types of messages could not be parsed and therefore were excluded from the dataset and from certain analytics (like emoji counts).

In ICA v2, new logic has been added to decode these typedstream-encoded messages and merge them into the main dataset, thanks to help from the pytypedstream package. This means that you can be confident that ICA will analyze the entirety of your message data for a conversation, not merely a subset of it.

Excel Support

The CLI and the Python API now support outputting your analyzer dataframe to Excel. This is achieved by specifying the new -o/--output flag on the CLI with a file path ending in .xlsx. You can also pass --format=xlsx if you want to capture or redirect the binary output for your own purposes.

For the Python API, you can pass the output parameter to ica.output_results() with an .xlsx file path. Alternatively, you can pass format='excel', with output as a BytesIO object.

    ica transcript -c 'Thomas Riverstone' -o ./my_transcript.xlsx

    ica.output_results(my_df, output='./my_transcript.xlsx')

Timezone Support

Previously, all dates/times in ICA v1 would assume the local system timezone of the user running the CLI. In v2, this is still the default behavior, but a new -t/--timezone option (or timezone parameter for ica.get_dataframes) has been added. This new parameter accepts any IANA timezone name (e.g. America/New_York or UTC).

    ica message_totals -c 'John Doe' -t UTC

    dfs = ica.get_dataframes(contact_name=my_contact_name, timezone='UTC')

Default Format Changes

The default format (i.e. when you omit the --format/-f/format option) has changed slightly from using the tabulate package to using pandas.DataFrame.to_string. This improves the consistency of the API to allow for writing data in the default format to a buffer or file (like other formats).

Before:
```
Date                 Total
2024-01-26 00:00:00     12
2024-01-27 00:00:00     45
2024-01-28 00:00:00     56
```

After:
```
Date        Total
2024-01-26     12
2024-01-27     45
2024-01-28     56
```

Support for Non-US Phone Numbers

ICA v2 now integrates with the phonenumbers package to standardize the parsing of phone numbers when looking up the conversation for a particular contact. A benefit of this integration is that non-US phone numbers are supported.

Dependency Upgrades and Changes

All project dependencies have been updated to their latest versions:

Upgraded (Existing) Dependencies

  • pandas has been upgraded to v2.2.0
  • tabulate has been upgraded to v0.9.0

New Dependencies

  • openpyxl (for reading and writing Excel files)
  • pyarrow (per the recommendation of pandas v2)
  • phonenumbers (to standardize the parsing of contact phone numbers)
  • tzlocal (for determining the local timezone of the user's system)

Full Test Suite

ICA v2 adds a full test suite, boasting 96% code coverage across the entire codebase. This includes tests for the core ica package and all built-in analyzers, for both the Python API and the CLI utility. With this, you may have greater confidence that the package will behave correctly in all the relevant cases.

CLI Changes

You may have noticed with the above examples that the Command Line API has also changed slightly. The -m parameter has been dropped in favor of specifying the analyzer name as a single positional parameter.

Before:
    ica -c 'John Doe' -m ica/metrics/message_totals.py -f csv

After:
    ica message_totals -c 'John Doe' -f csv
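
The shape of the new command line, with the analyzer name as a single positional argument, can be sketched with argparse. This is a simplified assumption about the layout rather than ICA's actual parser, which defines more options than the three shown here.

```python
import argparse

# Simplified sketch of the v2 command-line shape (not ICA's real parser)
parser = argparse.ArgumentParser(prog="ica")
# The analyzer name is a single positional argument in v2
parser.add_argument("analyzer")
parser.add_argument("-c", "--contact-name", required=True)
parser.add_argument("-f", "--format")

# Equivalent to running: ica message_totals -c 'John Doe' -f csv
cli_args = parser.parse_args(["message_totals", "-c", "John Doe", "-f", "csv"])
print(cli_args.analyzer, cli_args.contact_name, cli_args.format)
```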

Bug Fixes

  1. Emojis with a count of zero are now excluded from the "Most Frequent Emojis" data
  2. Dates with no messages sent are now excluded from the "Totals by Day" analyzer
  3. Fixed "Days Missed" and "Days with No Reply" calculation for the "Message Totals" analyzer
  4. Fixed compatibility with systems running versions of sqlite3 older than v3.39.0

Beyond that, there is a wealth of other small improvements to refactor and polish up the codebase.

No changes since beta 1; the release notes are largely copied from the beta 1 release notes

- Python
Published by caleb531 over 2 years ago

https://github.com/caleb531/imessage-conversation-analyzer - v2.0.0-beta.1

ICA v2 is the next major release of the library, and it represents as significant a milestone as the initial v1 release!

TL;DR

  1. In addition to the CLI, a comprehensive Python API has been added so that you can write custom programs to integrate with the library more easily
  2. It adds support for many more emoji
  3. It fixes some major bugs and makes the tool more intuitive to use
  4. It adds support for writing to Excel files
  5. It adds support for non-US phone numbers
  6. It adds timezone support to eliminate any potential for date/time ambiguity

Python API

Most notable is the addition of a fully typed Python API, which allows you to write custom analyzers that integrate with ICA with greater power and flexibility.

v1 had a concept of "metric files", which were rather limited in capability because they could only be called via the CLI and did not allow for post-processing.

In v2, these "metric files" have been re-dubbed "analyzers" for better clarity, and the new Python API lets you import the ica package in your own modules.

This new API was designed to be adaptable to different kinds of needs. That is, the processing of the message data provided by the library can be as simple or as sophisticated as you'd like. For example, you can either integrate with the built-in CLI or write your own processing logic.

We encourage you to look at the built-in analyzer modules as examples of how to use this new API.

Improved Emoji Support

Previously, ICA only supported a small subset of emoji for the "Most Frequent" analyzer. ICA v2 adds support for over 1,800 of the emoji supported by the Unicode standard. This should cover the majority of emojis that people use in their message conversations.

Parsing of Typedstream-Encoded Message Data

Certain messages in the macOS message database are encoded using Apple's binary typedstream format in a special attributedBody column. In ICA v1, these types of messages could not be parsed and therefore were excluded from the dataset and from certain analytics (like emoji counts).

In ICA v2, new logic has been added to decode these typedstream-encoded messages and merge them into the main dataset, thanks to help from the pytypedstream package. This means that you can be confident that ICA will analyze the entirety of your message data for a conversation, not merely a subset of it.

Excel Support

The CLI and the Python API now support outputting your analyzer dataframe to Excel. This is achieved by specifying the new -o/--output flag on the CLI with a file path ending in .xlsx. You can also pass --format=xlsx if you want to capture or redirect the binary output for your own purposes.

For the Python API, you can pass the output parameter to ica.output_results() with an .xlsx file path. Alternatively, you can pass format='excel', with output as a BytesIO object.

    ica transcript -c 'Thomas Riverstone' -o ./my_transcript.xlsx

    ica.output_results(my_df, output='./my_transcript.xlsx')

Timezone Support

Previously, all dates/times in ICA v1 would assume the local system timezone of the user running the CLI. In v2, this is still the default behavior, but a new -t/--timezone option (or timezone parameter for ica.get_dataframes) has been added. This new parameter accepts any IANA timezone name (e.g. America/New_York or UTC).

    ica message_totals -c 'John Doe' -t UTC

    dfs = ica.get_dataframes(contact_name=my_contact_name, timezone='UTC')

Default Format Changes

The default format (i.e. when you omit the --format/-f/format option) has changed slightly from using the tabulate package to using pandas.DataFrame.to_string. This improves the consistency of the API to allow for writing data in the default format to a buffer or file (like other formats).

Before:
```
Date                 Total
2024-01-26 00:00:00     12
2024-01-27 00:00:00     45
2024-01-28 00:00:00     56
```

After:
```
Date        Total
2024-01-26     12
2024-01-27     45
2024-01-28     56
```

Support for Non-US Phone Numbers

ICA v2 now integrates with the phonenumbers package to standardize the parsing of phone numbers when looking up the conversation for a particular contact. A benefit of this integration is that non-US phone numbers are supported.

Dependency Upgrades and Changes

All project dependencies have been updated to their latest versions:

Upgraded (Existing) Dependencies

  • pandas has been upgraded to v2.2.0
  • tabulate has been upgraded to v0.9.0

New Dependencies

  • openpyxl (for reading and writing Excel files)
  • pyarrow (per the recommendation of pandas v2)
  • phonenumbers (to standardize the parsing of contact phone numbers)
  • tzlocal (for determining the local timezone of the user's system)

Full Test Suite

ICA v2 adds a full test suite, boasting 96% code coverage across the entire codebase. This includes tests for the core ica package and all built-in analyzers, for both the Python API and the CLI utility. With this, you may have greater confidence that the package will behave correctly in all the relevant cases.

CLI Changes

You may have noticed with the above examples that the Command Line API has also changed slightly. The -m parameter has been dropped in favor of specifying the analyzer name as a single positional parameter.

Before:
    ica -c 'John Doe' -m ica/metrics/message_totals.py -f csv

After:
    ica message_totals -c 'John Doe' -f csv

Bug Fixes

  1. Emojis with a count of zero are now excluded from the "Most Frequent Emojis" data
  2. Dates with no messages sent are now excluded from the "Totals by Day" analyzer
  3. Fixed "Days Missed" and "Days with No Reply" calculation for the "Message Totals" analyzer
  4. Fixed compatibility with systems running versions of sqlite3 older than v3.39.0

Beyond that, there is a wealth of other small improvements to refactor and polish up the codebase.

- Python
Published by caleb531 over 2 years ago

https://github.com/caleb531/imessage-conversation-analyzer - v1.2.3

  • Fixed the CLI program failing to run due to a number of missing file errors
    • Everyone is strongly encouraged to update to this version

- Python
Published by caleb531 over 2 years ago

https://github.com/caleb531/imessage-conversation-analyzer - v1.2.1

  • Fixed a critical bug affecting the v1.2.0 distributions where the emoji data was missing, causing the most_frequent_emojis and least_frequent_emojis metrics to raise an exception.

- Python
Published by caleb531 over 2 years ago

https://github.com/caleb531/imessage-conversation-analyzer - v1.2.0

New Features

  • Include Spotify links in attachment totals metric data

Bug Fixes

  • Fixed several bugs where the package would be unable to find a contact under any of the following conditions:
    • The last name was not on the contact record
    • A phone number was present, but an email address was missing
    • An email address was present, but a phone number was missing
    • There was leading or trailing whitespace in the contact name given on the command line
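
The kind of lookup these fixes make more forgiving can be sketched as below. The find_contact and normalize_contact_name helpers and the contact dictionaries are hypothetical, and this is not the package's actual lookup code; it only illustrates tolerating a missing last name, a missing phone number or email address, and stray whitespace in the name.

```python
def normalize_contact_name(name):
    """Trim and collapse whitespace, and ignore case (hypothetical helper)."""
    return " ".join(name.split()).casefold()


def find_contact(contacts, name):
    """Return the first contact whose full name matches, or None."""
    wanted = normalize_contact_name(name)
    for contact in contacts:
        # Build the full name from whichever name parts are present
        parts = (contact.get("first_name"), contact.get("last_name"))
        full_name = " ".join(part for part in parts if part)
        if normalize_contact_name(full_name) == wanted:
            return contact
    return None


# Made-up records: some have no last name, phone number, or email address
contacts = [
    {"first_name": "Jane", "last_name": None, "phone": "555-0100"},
    {"first_name": "John", "last_name": "Doe", "email": "john@example.com"},
]
print(find_contact(contacts, "  John Doe ")["email"])
print(find_contact(contacts, "Jane")["phone"])
```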

Upgrades

  • Upgraded pandas from v1.1.2 to v1.3.2
  • Upgraded tabulate from v0.8.7 to v0.8.9

- Python
Published by caleb531 over 2 years ago

https://github.com/caleb531/imessage-conversation-analyzer - v1.1.0

  • Added a new conversation_export metric file; this is designed to allow easy exporting of an entire iMessage thread (to a format like CSV, for example)
  • The case of the contact name is now ignored; --contact-name 'john doe' and --contact-name 'John Doe' are now equivalent
  • Fixed an AttributeError: 'NoneType' object has no attribute 'replace' error

- Python
Published by caleb531 about 5 years ago

https://github.com/caleb531/imessage-conversation-analyzer - v1.0.1

  • Fixed a bug where --format csv / -f csv would not return the data in CSV format

- Python
Published by caleb531 about 5 years ago

https://github.com/caleb531/imessage-conversation-analyzer - v1.0.0

  • Initial stable API

- Python
Published by caleb531 over 5 years ago