gview

GView is a cross-platform framework for reverse-engineering. Users can leverage the diverse range of available visualization options to effectively analyze and interpret the information.

https://github.com/gdt050579/gview

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.5%) to scientific vocabulary

Keywords

cpp20 malware-analysis malware-research reverse-engineering visualization
Last synced: 4 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: gdt050579
  • License: mit
  • Language: C++
  • Default Branch: main
  • Homepage:
  • Size: 18.2 MB
Statistics
  • Stars: 47
  • Watchers: 3
  • Forks: 41
  • Open Issues: 89
  • Releases: 27
Topics
cpp20 malware-analysis malware-research reverse-engineering visualization
Created over 4 years ago · Last pushed 4 months ago
Metadata Files
Readme License Citation

README.md

GView

General description

GView framework is a powerful tool for examining files or any data with a defined structure, such as buffers or memory zones. Users can leverage the diverse range of available visualization options to effectively analyze and interpret the information.

On the other hand, from the perspective of developers, GView offers a flexible platform to create plugins that can parse various data structures. Developers can harness this capability to develop customized views and enhance the analysis capabilities of GView. By creating plugins, developers can extend the framework's functionality and tailor it to specific data formats or requirements, enabling more efficient and insightful data analysis.

See GView in action

Scenario 1: Malicious Infection A screencast demonstrates how GView can be used to analyze the contents of a compromised system. By examining the network traffic, a security analyst can uncover the methods used by malicious actors to gain unauthorized access. GView's visualization capabilities enable rapid identification of suspicious hints and help to understand the attack's impact. Open video.

Scenario 2: Suspicious Email GView is used to analyze the impact of a breach after a victim has been infected and to assess the threat it poses. The breach originated from a suspicious email. By analyzing the email headers, attachments, and embedded content, security experts can uncover hidden malicious code or phishing attempts. GView's ability to handle various file types and data structures provides a comprehensive view of the email's components, aiding in threat assessment. Open video.

Smart Viewers

Smart viewers are software components designed to display data in various formats or representations. In the context of a data identifier plugin, multiple smart viewers are usually available, with one being designated as the primary viewer. This setup allows users to effortlessly switch between different viewers, selecting the visualization method that most effectively meets their specific needs.

Buffer Viewer

Interprets data as a binary buffer and possesses the ability to identify and highlight specific portions of the buffer using various specifications such as regular expressions, offsets, content patterns, and more. It can effectively detect and highlight strings in both ASCII and Unicode formats. Additionally, the view is equipped with the capability to adjust code pages, enabling the clear representation of characters from diverse languages.
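To make the string-highlighting idea concrete, here is a minimal sketch of ASCII string detection in a raw buffer. It is not GView's implementation: the names `FoundString`, `IsPrintableAscii` and `FindAsciiStrings`, as well as the default minimum length of 4, are assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// One region to highlight: where the string starts and what it contains.
struct FoundString {
    std::size_t offset;
    std::string text;
};

// Printable 7-bit ASCII range (space through tilde).
inline bool IsPrintableAscii(std::uint8_t byte) {
    return byte >= 0x20 && byte <= 0x7E;
}

// Hypothetical helper (not GView's API): collect runs of printable ASCII
// bytes that are at least minLength characters long.
std::vector<FoundString> FindAsciiStrings(const std::vector<std::uint8_t>& buffer,
                                          std::size_t minLength = 4) {
    assert(minLength > 0);
    std::vector<FoundString> result;
    std::size_t runStart = 0;
    bool inRun = false;

    // Iterate one position past the end so a run that touches the end of
    // the buffer is closed by the same code path as any other run.
    for (std::size_t i = 0; i <= buffer.size(); ++i) {
        const bool printable = i < buffer.size() && IsPrintableAscii(buffer[i]);
        if (printable && !inRun) {
            runStart = i;
            inRun = true;
        } else if (!printable && inRun) {
            if (i - runStart >= minLength) {
                const char* first = reinterpret_cast<const char*>(buffer.data()) + runStart;
                result.push_back({runStart, std::string(first, i - runStart)});
            }
            inRun = false;
        }
    }
    return result;
}
```

A Unicode (UTF-16) variant would follow the same pattern but step two bytes at a time; that part is left out of the sketch.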

Text Viewer

Interprets data as a sequence of characters arranged in lines, each separated by an identifier. It offers comprehensive support for various line separators, including CR, LF, CRLF, and LFCR. The view also provides flexible options for alignment, allowing different interpretations of the TAB character, and offers customizable wrapping settings for each line.
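As an illustration of the separator handling, the sketch below splits text into lines while treating CR, LF, CRLF and LFCR each as one separator. The function name `SplitLines` is hypothetical and the greedy pairing rule is an assumption, not a description of how GView resolves ambiguous sequences such as LF CR LF.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper (not GView's API): split text into lines, treating
// CR, LF, CRLF and LFCR each as a single line separator.
std::vector<std::string> SplitLines(std::string_view text) {
    std::vector<std::string> lines;
    std::size_t lineStart = 0;
    std::size_t i = 0;

    while (i < text.size()) {
        const char c = text[i];
        if (c != '\r' && c != '\n') {
            ++i;
            continue;
        }
        lines.emplace_back(text.substr(lineStart, i - lineStart));
        // A two-character separator is the *other* newline character
        // immediately after this one (CR+LF or LF+CR). Pairing is greedy.
        const bool paired = i + 1 < text.size() &&
                            (text[i + 1] == '\r' || text[i + 1] == '\n') &&
                            text[i + 1] != c;
        i += paired ? 2 : 1;
        lineStart = i;
    }
    // Whatever follows the last separator (possibly nothing) is the final line.
    lines.emplace_back(text.substr(lineStart));
    assert(!lines.empty());
    return lines;
}
```

Two identical newline characters in a row are never merged, so an empty line between two LF characters is preserved.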

Lexical Viewer

This viewer leverages the lexer provided by the data identifier plugin. It offers advanced functionalities such as displaying text with highlighted colors, enabling the folding or collapsing of code blocks, and incorporating diverse refactoring operations like variable and function renaming. Additionally, a data identification plugin can provide a range of language-specific transformations that can be applied to the text.

Image Viewer

Displays a graphical rendering of images stored in different formats.

Table Viewer

Represents data that has a tabular format in an organized manner (CSV, TSV, SQL databases).

Dissasm Viewer

Presents the content of binary files through a disassembly process that examines the code and deduces relevant details like imported function names, parameter names, and string pointers. This disassembly process relies on the Capstone library. In addition, the data identifier plugin plays a crucial role by providing essential information such as the required decoding method (e.g., x86, x64) and the code's entry point offset.

Container Viewer

The viewer displays a range of components that can be extracted using the current data identifier plugin. This versatile functionality can be applied in various scenarios, such as extracting files from an archive or extracting streams from a PCAP file.

Data Identifier Plugins

In the context of data analysis, a data identifier plugin serves as a valuable component capable of examining a buffer, whether in its binary form or translated into a textual representation. Its primary function is to automatically extract relevant and specific information from the buffer based on its type. To facilitate a comprehensive analysis, each data identifier plugin is equipped with dedicated panels designed to display detailed information specific to the identified data type.

| Plugin name | Data Type | Capabilities |
| ----------- | --------- | ------------ |
| BMP | BMP (Bitmap) format is a widely used file format for storing raster graphics images | Open and view BMP files, providing a graphical representation of the image contained within the file |
| CPP | CPP files are files with the .cpp file extension, which indicates that they contain C++ source code | Parse .cpp files, remove comments |
| CSV | CSV and TSV files are both common file formats used for storing and organizing tabular data. | Parse .csv or .tsv files, sort columns, resize table, multiple selection, copy & paste |
| ELF | ELF (Executable and Linkable Format) files are a standard file format used in many Unix-like operating systems, including Linux | Parse ELF formats (including detailed information about binaries built from GO language), highlight opcodes, extract sections, segments, symbols (static & dynamic) |
| ICO | ICO (Icon) files are image files used to represent icons in Windows operating systems | Open and view ICO files, providing a graphical representation of the image(s) contained within the file |
| INI | INI files are plain text files with a simple structure consisting of sections, keys, and values | Parse .ini files (TOML), remove comments |
| ISO | ISO (International Organization for Standardization) files are archive files that contain an exact copy or image of a CD, DVD, or Blu-ray disc | Parse ISO files (ECMA 119) building a navigable tree from their content |
| JOB | JOB files are associated with Task Scheduler, a built-in Windows utility that allows users to schedule and automate tasks on their system. These .job files are binary files that store the configuration and settings for a scheduled task created using Task Scheduler | Parse .job files, highlight each component |
| JS | A .js file is a file that contains JavaScript code | Parse .js files, remove comments, const propagation, reverse strings. Work in progress. |
| JSON | A JSON (JavaScript Object Notation) file is a file format used to store and exchange structured data in a lightweight and human-readable manner | Parse .json files, converts all keys to uppercase |
| JT | Jupiter Tessellation (JT) is a 3D data format which corresponds to ISO 14306:2012 standard | Parse .jt files, highlight each component |
| LNK | A .lnk file, also known as a Windows Shortcut file, is a file type used in the Windows operating system to create shortcuts to files, folders, programs, or specific locations within the system | Parse .lnk file, extract all available data (detailed "ExtraData", "LinkTargetIDList" and "LocationInformation"), highlight components |
| MACHO | Mach-O (Mach Object) and Mach-O Fat files are file formats used primarily in macOS and iOS systems for executable and object code | Parse Mach-O and Mach-O Fat formats (including detailed information about binaries built from GO language), highlight opcodes, extract sections, segments, symbols (static & dynamic), including validation of digital signature (hashes created on the sections binary) |
| PREFETCH | Prefetch files contain metadata and information about the program's file access patterns and dependencies. They include details such as the program's executable file name, related DLL (Dynamic Link Library) files, and the order in which files are accessed during program execution | Parse Prefetch files extracting in an organized manner all the available data (paths, volumes, dependencies) |
| MAM | MAM files are Prefetch files stored in a compressed form | Decompresses MAM files |
| PCAP | PCAP (Packet Capture) files, also known as pcap files, are a common file format used for capturing and storing network packet data | Support built for HTTP streams (more to follow) |
| PDF | PDF (Portable Document Format) files - the ISO standard for electronic documents containing text, images, and embedded objects | Parse and view PDFs; show binary, structural, and text layers; extract text; list objects/streams; flag JavaScript, embedded files, and other potential IOCs |
| PE | PE (Portable Executable) files are a file format used by Windows operating systems for executable programs, DLL (Dynamic Link Library) files, and other system components | Parse PE format (including detailed information about binaries built from GO language), highlight opcodes, extract sections, segments, symbols (static & dynamic), including validation of digital signature (CRL revocation on Windows or OpenSSL manual parsing & validation on Unix systems) |
| PYEXTRACTOR | PyInstaller combines a Python application and its associated dependencies into a unified package, enabling the execution of the packaged application without the need for a separate Python interpreter or individual module installations | Extract the PyInstaller-generated package of the bundled Python application and its dependencies from the packaged format (ELF, Mach-O, PE) |
| VBA | VBA (Visual Basic for Applications) format refers to the file format used to store VBA code modules and associated macros within Microsoft Office documents, such as Excel workbooks, Word documents, PowerPoint presentations, and Access databases | Parse .VB (Visual Basic) files. Work in progress. |
| ZIP | ZIP is a widely used file compression and archival format | Parse .zip files building a navigable tree from their content |
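One common way to recognize several of the formats in the table is to compare the first bytes of a buffer against well-known signatures. The sketch below shows that technique only. It is not how GView's plugins are confirmed to work, and the names `Signature`, `KnownSignatures`, `IdentifyByMagic` and the `"GENERIC"` fallback label are assumptions made for illustration. Real parsers validate far more than the leading bytes (for example, an `MZ` prefix alone does not prove a valid PE file).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// A signature-based matcher; the names mirror the plugin table for illustration.
struct Signature {
    std::string pluginName;
    std::vector<std::uint8_t> magic;  // bytes expected at offset 0
};

// Well-known leading bytes for a few of the formats listed in the table.
const std::vector<Signature>& KnownSignatures() {
    static const std::vector<Signature> signatures = {
        {"ELF", {0x7F, 'E', 'L', 'F'}},
        {"PDF", {'%', 'P', 'D', 'F', '-'}},
        {"ZIP", {'P', 'K', 0x03, 0x04}},
        {"PE", {'M', 'Z'}},
        {"BMP", {'B', 'M'}},
    };
    return signatures;
}

// Hypothetical helper (not GView's API): return the first entry whose magic
// bytes match the start of the buffer, or "GENERIC" when nothing matches.
std::string IdentifyByMagic(const std::vector<std::uint8_t>& buffer) {
    for (const Signature& sig : KnownSignatures()) {
        assert(!sig.magic.empty());
        if (buffer.size() >= sig.magic.size() &&
            std::equal(sig.magic.begin(), sig.magic.end(), buffer.begin())) {
            return sig.pluginName;
        }
    }
    return "GENERIC";
}
```

Text-based formats from the table (CPP, INI, JS, JSON) have no magic bytes, so a check like this would only ever be a first filter.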

Architecture

The architecture flow can be summarized as follows:

  1. Determine whether the content is binary or textual. For textual content, attempt to identify the encoding and decode it into a standardized format such as UTF-16.

  2. Explore all available data identifier plugins to locate one capable of interpreting the given data. If no specific plugin is found, fall back to a generic plugin that solely distinguishes between binary and textual files.

  3. The data identifier plugin selects suitable smart viewers tailored to the current data. Users can switch between these viewers and use them to extract artifacts or data from the existing content.

  4. Repeat steps 1 to 3 for the extracted components. This iterative process relies on artifacts obtained in previous steps to drive subsequent analysis and extraction.
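The flow described above can be sketched as a small recursive routine: pick a plugin, fall back to a generic text/binary classification when none matches, then repeat for every extracted component. Everything in this sketch is hypothetical: the `Plugin` struct, `LooksTextual`, `Analyze`, the NUL-byte heuristic and the depth limit are illustrative assumptions, not GView's real interfaces, and encoding detection and viewer selection are omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

using Buffer = std::vector<std::uint8_t>;

// Hypothetical plugin interface (not GView's real one): a name, a test that
// says whether the plugin understands the buffer, and an extractor that
// returns embedded components (empty when there are none).
struct Plugin {
    std::string name;
    std::function<bool(const Buffer&)> canParse;
    std::function<std::vector<Buffer>(const Buffer&)> extract;
};

// Deliberately naive text/binary heuristic: any NUL byte means binary.
bool LooksTextual(const Buffer& data) {
    for (std::uint8_t byte : data) {
        if (byte == 0) return false;
    }
    return true;
}

// Select a plugin (or the generic fallback), record the choice indented by
// nesting depth, then repeat the process for every extracted component.
void Analyze(const std::vector<Plugin>& plugins, const Buffer& data,
             std::size_t depth, std::vector<std::string>& report) {
    constexpr std::size_t kMaxDepth = 16;  // arbitrary guard for this sketch
    if (depth >= kMaxDepth) return;

    const Plugin* chosen = nullptr;
    for (const Plugin& plugin : plugins) {
        assert(plugin.canParse && plugin.extract);  // both callbacks must be set
        if (plugin.canParse(data)) {
            chosen = &plugin;
            break;
        }
    }

    std::string name;
    if (chosen != nullptr) {
        name = chosen->name;
    } else {
        name = LooksTextual(data) ? "generic-text" : "generic-binary";
    }
    report.push_back(std::string(depth * 2, ' ') + name);

    if (chosen != nullptr) {
        for (const Buffer& component : chosen->extract(data)) {
            Analyze(plugins, component, depth + 1, report);
        }
    }
}
```

The depth limit is a design choice of the sketch: nested containers (an archive inside an archive) are the normal case for this kind of analysis, so some bound on recursion is prudent.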

As for the actual code components involved in the project, here are a high-level view and a core view:

[Diagram: high-level view of the code components]

[Diagram: core view of the code components]

Building

Tools used

CMake is used to build the entire project regardless of the platform. Usage of vcpkg in our build pipeline can be seen here.

Supported platforms

Windows

Works out of the box using vcpkg.

OSX

We are using vcpkg. It requires curl installation.

Some vcpkg ports require pkg-config, which must be installed manually via the brew package manager before building the project.

Linux (Intel)

Requires pkg-config package.

Works using vcpkg and curl (for vcpkg).

Linux (ARM (M1))

Requires pkg-config package.

Uncomment this line in the top-level CMakeLists.txt for Linux ARM architectures:

```
set(ENV{VCPKG_FORCE_SYSTEM_BINARIES} 1)
```

This requires a manual installation of ninja (ninja-build).

CI/CD

We use GitHub Actions to ensure that the project builds on Windows, OSX and Linux, and we are working towards creating and storing artifacts and eventually building a release flow. For static analysis, we use CodeQL and Microsoft C++ Code Analysis.

Documentation

The project uses Sphinx as the main documentation engine. Sphinx sources are located in the docs folder.

On every commit to main, a compiled version of the Sphinx documentation is published to gh-pages and then to docs.

First run

There's currently a pre-release/beta GitHub CI/CD pipeline that creates an archive for each supported operating system. The archive can be downloaded and the package run via the main binary, GView, although there is a catch that depends on the platform. The binaries built by GitHub Actions target the Intel processor architecture, but GView can be built for ARM as well.

Windows

Runs out of the box (tested on Windows 10 x64): unzip the release archive and run GView.exe.

Linux

Ubuntu 18.04 is deprecated in GitHub Actions. Our precompiled binaries don't work there (the GLIBC version is too old).

We are currently building on Ubuntu 20.04 (Intel) and this allows us to run GView on Ubuntu 20.04 and Ubuntu 22.04.

MacOS / OSX

You'll get this warning for each GView binary from the package you're downloading: "macOS cannot verify that this app is free from malware. Chrome downloaded this file today at 22:31." Unfortunately, unless you temporarily disable Gatekeeper entirely (which would put your computer at risk), you need to manually allow each binary to run, one by one. This cannot be avoided until we digitally sign the binaries (macOS signing and notarization).

Start contributing

  • Clone this repository together with its submodules: `git clone --recurse-submodules <your-repo-link/GView.git>`

Contributors can install the documentation tooling with `pip install -r requirements.txt`, which installs Sphinx and sphinx_rtd_theme. The documentation is built locally with `make html`.

After the command executes successfully, the HTML pages can be found in the build folder.

Owner

  • Name: Gavrilut Dragos
  • Login: gdt050579
  • Kind: user

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: 'GView: A versatile assistant for security researchers'
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Raul
    family-names: Zaharia
    orcid: 'https://orcid.org/0009-0005-6366-9152'
    affiliation: 'Al. I. Cuza University & Bitdefender, Iaşi, Romania'
  - given-names: Dragoş
    family-names: Gavriluţ
    affiliation: 'Al. I. Cuza University & Bitdefender, Iaşi, Romania'
    orcid: 'https://orcid.org/0009-0004-3339-9625'
  - given-names: Gheorghiță
    family-names: Mutu
    orcid: 'https://orcid.org/0009-0007-6998-9469'
    affiliation: 'Al. I. Cuza University & Bitdefender, Iaşi, Romania'
  - affiliation: 'Al. I. Cuza University, Iași, Romania'
    given-names: Dorel
    family-names: Lucanu
    orcid: 'https://orcid.org/0000-0001-8097-040X'
identifiers:
  - type: doi
    value: 10.1016/j.softx.2024.101940
  - type: url
    value: >-
      https://www.sciencedirect.com/science/article/pii/S2352711024003108
repository-code: 'https://github.com/gdt050579/GView'
repository-artifact: 'https://github.com/gdt050579/GView/tags'
abstract: >-
  We propose a tool, GView (Generic View), that is tailored
  to assist the investigation of possible attack vectors by
  providing guided analysis for a broad range of file types
  using automatic artifact identification, extraction,
  inference&coherent correlation, and meaningful&intuitive
  views at different levels of granularity w.r.t. revealed
  information. GView simplifies the analysis of every
  payload in a complex attack, streamlining the workflow for
  security researchers, and increasing the accuracy of the
  analysis. The ’generic’ aspect derives from the fact that
  it accommodates various file types and also features
  multiple visualization modes (that can be automatically
  configured for each specific file type). Our results show
  that the analysis time of an attack is significantly
  reduced by GView, compared to conventional tools used in
  forensics.
keywords:
  - Cybersecurity
  - Automatic artifact identification
  - Intuitive views
  - Coherent data correlation
  - Malware analysis
license: MIT

GitHub Events

Total
  • Create event: 13
  • Release event: 5
  • Issues event: 16
  • Watch event: 7
  • Delete event: 8
  • Member event: 1
  • Issue comment event: 7
  • Push event: 165
  • Pull request review comment event: 6
  • Pull request review event: 24
  • Pull request event: 31
  • Fork event: 26
Last Year
  • Create event: 13
  • Release event: 5
  • Issues event: 16
  • Watch event: 7
  • Delete event: 8
  • Member event: 1
  • Issue comment event: 7
  • Push event: 165
  • Pull request review comment event: 6
  • Pull request review event: 24
  • Pull request event: 31
  • Fork event: 26

Issues and Pull Requests

Last synced: 4 months ago

All Time
  • Total issues: 6
  • Total pull requests: 11
  • Average time to close issues: 3 months
  • Average time to close pull requests: 2 days
  • Total issue authors: 3
  • Total pull request authors: 4
  • Average comments per issue: 0.17
  • Average comments per pull request: 0.0
  • Merged pull requests: 7
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 6
  • Pull requests: 11
  • Average time to close issues: 3 months
  • Average time to close pull requests: 2 days
  • Issue authors: 3
  • Pull request authors: 4
  • Average comments per issue: 0.17
  • Average comments per pull request: 0.0
  • Merged pull requests: 7
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • rzaharia (80)
  • gheorghitamutu (16)
  • xTachyon (2)
  • Cosmin765 (2)
  • raresradua (1)
  • tanlorik (1)
  • valentint8 (1)
Pull Request Authors
  • rzaharia (37)
  • gheorghitamutu (18)
  • D1n03 (4)
  • Cosmin765 (3)
  • delia1205 (1)
  • UnexomWid (1)
  • OpariucRares (1)
  • xTachyon (1)
Top Labels
Issue Labels
Enhancement (33) Bug (22) Development (22) Documentation (4) Low Priority (1)
Pull Request Labels
Enhancement (22) Development (21) Bug (21) Documentation (3)

Dependencies

docs/requirements.txt pypi
  • sphinx *
  • sphinx-rtd-theme *
.github/workflows/ci.yml actions
  • actions/checkout v2 composite
.github/workflows/codeql-analysis.yml actions
  • actions/checkout v2 composite
  • github/codeql-action/analyze v2 composite
  • github/codeql-action/init v2 composite
.github/workflows/create_tag.yml actions
  • actions/checkout v2 composite
  • negz/create-tag v1 composite
.github/workflows/deploy_release.yml actions
  • actions/checkout v2 composite
  • actions/download-artifact v2 composite
  • actions/upload-artifact v3 composite
  • marvinpinto/action-automatic-releases latest composite
.github/workflows/gh-pages.yml actions
  • actions/checkout v1 composite
  • ad-m/github-push-action master composite
  • ammaraskar/sphinx-action master composite
.github/workflows/increase_version.yml actions
  • actions/checkout v2 composite
  • ad-m/github-push-action master composite
.github/workflows/msvc.yml actions
  • actions/checkout v2 composite
  • actions/upload-artifact v3 composite
  • github/codeql-action/upload-sarif v2 composite
  • microsoft/msvc-code-analysis-action v0.1.1 composite
vcpkg.json vcpkg
  • brotli *
  • bzip2 *
  • capstone *
  • freetype *
  • libiconv *
  • libpng *
  • ncurses *
  • openssl *
  • sdl2 *
  • sdl2-ttf *
  • zlib *