Ideas

Students: Instructions on getting started. Right now, we're still preparing for GSoC 2024 and we expect to have a close to complete list of projects by Feb 5, 2024.

If you're a sub-org who wants to join, please read the information for sub-orgs.


MSS - Mission Support System


The Mission Support System (MSS) is software written by scientists in the field of atmospheric science. Its purpose is to simplify the planning of scientific flights in which atmospheric parameters are measured. MSS helps optimize the scientific outcome of research flights by displaying the planned flight route and the corresponding model parameters on the same platform for each of the options under discussion. It thereby reduces the number of flight hours needed to answer a scientific question and, in the end, saves taxpayers' money.

CVE Binary Tool


The CVE Binary Tool helps you determine if your system includes known vulnerabilities. You can scan binaries for over 200 common, vulnerable components (openssl, libpng, libxml2, expat and others), or if you know the components used, you can get a list of known vulnerabilities associated with an SBOM or a list of components and versions.

xbitinfo


Xbitinfo is an open-source Python package that enables lossy compression of geo-spatial data based on its information content. Embedded into the pangeo ecosystem, xbitinfo builds on top of xarray and dask and allows for fast compression and analysis of various data formats including netCDF and zarr. Xbitinfo addresses the challenge of ever-larger datasets, which are being produced as more compute power becomes available. Climate simulations at sub-km resolution that produce petabytes of output are just one example where xbitinfo can help keep the dataset manageable.
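
To make the idea of lossy compression concrete, here is a minimal sketch of bit rounding: it keeps only a chosen number of mantissa bits of a 32-bit float and zeroes the rest, so the data compresses better afterwards. It uses only the Python standard library and is not xbitinfo's API. The function name `bitround` and the hand-picked `keepbits` value are assumptions for illustration; xbitinfo instead chooses how many bits to keep from the information content of the data.

```python
import struct

MANTISSA_BITS = 23  # explicit mantissa bits in an IEEE 754 float32


def bitround(value: float, keepbits: int) -> float:
    """Round a float32 to `keepbits` mantissa bits (illustrative helper)."""
    if not 0 <= keepbits <= MANTISSA_BITS:
        raise ValueError("keepbits must be between 0 and 23")

    # Reinterpret the float32 as an unsigned 32-bit integer.
    (bits,) = struct.unpack("<I", struct.pack("<f", value))

    drop = MANTISSA_BITS - keepbits  # number of trailing bits to discard
    if drop > 0:
        # Round to nearest, ties to even, then zero the discarded bits.
        lsb = (bits >> drop) & 1
        half_minus_one = (1 << (drop - 1)) - 1
        mask = ~((1 << drop) - 1) & 0xFFFFFFFF
        bits = (bits + half_minus_one + lsb) & mask

    # Reinterpret the integer as a float32 again.
    (rounded,) = struct.unpack("<f", struct.pack("<I", bits))
    return rounded


print(bitround(3.14159, 3))  # 3.25
```

With only three mantissa bits kept, 3.14159 becomes 3.25. The trailing zero bits are what a lossless compressor can then exploit. The sketch does not treat NaN or infinity specially.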

MNE-Python


MNE-Python is an open-source Python package for exploring, visualizing, and analyzing human neurophysiological data such as MEG, EEG, sEEG, ECoG, and more. It includes modules for data input/output, preprocessing, visualization, source estimation, time-frequency analysis, connectivity analysis, machine learning, and statistics.

pocketpy


pkpy is a lightweight (~15K LOC) Python interpreter for game scripting, built on C++17 with STL.

It aims to be an alternative to Lua for game scripting, with elegant syntax, powerful features and competitive performance. pkpy is extremely easy to embed via a single header file pocketpy.h, without external dependencies.

LPython


LPython is an ahead-of-time compiler for Python written in C++, and it has multiple backends to generate code, including LLVM and C. The compiler has been open-sourced under the BSD license and is available at github.com/lcompilers/lpython. It is designed as a library with separate building blocks: the parser, Abstract Syntax Tree (AST), Abstract Semantic Representation (ASR), semantic phase, and codegen. These are all exposed to the user or developer in a natural way to make it easy to contribute back. It works on Windows, Linux, and macOS. The speed of LPython comes from the high-level optimizations done at the ASR level, as well as the low-level optimizations that LLVM can do. In addition, it is remarkably easy to customize the backends.

DIPY


DIPY is the paragon 3D/4D+ imaging library in Python. It contains generic methods for spatial normalization, signal processing, machine learning, statistical analysis and visualization of medical images. Additionally, it contains specialized methods for computational anatomy including diffusion, perfusion and structural imaging.

FURY


Free Unified Rendering in pYthon is a Python package that provides a minimalistic but powerful API that enables advanced scientific visualization and 3D animations for scientific research. FURY is a community-driven, open-source, and high-performance scientific visualization library that harnesses the graphics processing unit (GPU) for improved speed, precise interactivity, and visual clarity. It was created to address the growing necessity of high-performance 3D scientific visualization in an easy-to-use API fully compatible with the Pythonic ecosystem. To achieve this, FURY takes ideas from CGI (Computer-Generated Imagery) and game engines to then be deployed for usage in everyday research practices.

PyElastica


PyElastica is the Python implementation of Elastica, a free and open-source software project for the simulation of assemblies of slender, one-dimensional bodies using Cosserat rod theory, which provides a powerful and versatile framework for modeling the dynamics of slender structures interacting among themselves and with their environment. We are focused on providing useful simulation tools to the robotics and biomechanics communities to model, control, and visualize how these slender structures evolve and interact.

D-SEAMS


We're an organization centered around growing the molecular dynamics post-processing toolkit called d-SEAMS (Deferred Structural Elucidation Analysis for Molecular Simulations).

Seldon-code


Seldon-code is a trio of tools designed to revolutionize opinion dynamics simulations. Our core, a robust C++ engine, Seldon, drives detailed simulations. Robbie, our neural network layer, offers a playground for AI experimentation. Hari-Plotter, the visualization companion, brings data to life.

Borg Collective


We are the Borg Collective and maintain multiple Python-based backup tools that are often used in combination: Borg, Borgmatic and Vorta. The core Borg tool is a deduplicating archiver with compression and encryption. Vorta is a desktop backup client that integrates with Linux and macOS desktops. Borgmatic is a wrapper for server systems that also takes care of database backups and pre-backup commands.
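
For readers new to deduplicating archivers, here is a minimal sketch of the general idea: data is split into chunks, each chunk is stored once under its hash, and an archive is just a list of chunk hashes. This is not Borg's code or repository format. The `ChunkStore` class, its method names and the tiny fixed chunk size are assumptions for illustration, whereas Borg itself uses variable-size, content-defined chunks.

```python
from __future__ import annotations

import hashlib


class ChunkStore:
    """Toy content-addressed store (hypothetical, for illustration only)."""

    def __init__(self, chunk_size: int = 4) -> None:
        self.chunk_size = chunk_size
        self.chunks: dict[str, bytes] = {}  # SHA-256 hex digest -> chunk data

    def add(self, data: bytes) -> list[str]:
        """Store `data` and return the list of chunk digests describing it."""
        recipe = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            # A chunk that is already known is not stored a second time.
            self.chunks.setdefault(digest, chunk)
            recipe.append(digest)
        return recipe

    def restore(self, recipe: list[str]) -> bytes:
        """Rebuild the original data from its list of chunk digests."""
        return b"".join(self.chunks[digest] for digest in recipe)


store = ChunkStore(chunk_size=4)
first = store.add(b"aaaabbbbaaaa")   # three chunks, two of them identical
second = store.add(b"bbbbcccc")      # shares the "bbbb" chunk with `first`
print(len(store.chunks))             # 3
```

Five chunks were added in total, but only three distinct ones are kept. Both inputs can still be rebuilt with `store.restore(first)` and `store.restore(second)`.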

Python Argentina


Main 2024 Project: PyZombis, an online programming course covering Web UI, Databases and PyGame (browser-based, fully interactive, no server required). Other community projects include a library for a government API (invoicing) and an app for lawyers; custom proposals are also welcome (contact the mentors first). Python Argentina Civil Association (A.C.PyAr) is a formal non-profit organization of programmers with a focus on community projects, mainly for Spanish speakers and Latin Americans. Spanish is one of the most spoken languages in the world, and our countries often lack open source software to fulfill regional needs. Our projects aim to provide tools and resources to students, enthusiasts and professionals, so that it is easier to learn and use Python in this region of the world. Also, many of our projects can be extended to other situations, contributing back to the international community.

GNU Mailman


Mailman 3 is free software for managing electronic mail discussion and e-newsletter lists. Many open source denizens will be familiar with Mailman 2.1. All of our current work is on Mailman 3, which was released in 2015, but there's still lots of room for new features and ideas! Mailman 3 is integrated with the web using the Django web framework, making it easy for users to manage their accounts and for list owners to administer their lists in a pleasant modern environment. Mailman 3 supports built-in archiving, automatic bounce processing, content filtering, digest delivery, spam filters, and more. Mailman 3's bundled archive software, HyperKitty, also functions like a web forum and integrates with indexing engines such as Xapian and Whoosh.
The Mailman developers are a moderately diverse group, but we strive for inclusion. We have participated in Google Summer of Code almost every year since 2012, and occasionally supply mentors and org admins to other organizations (including the PSF umbrella org and Systers).

Pwndbg


Pwndbg is a plugin for GDB that improves the debugging experience for low-level software developers, hardware hackers, reverse engineers, security researchers and capture-the-flag security competition players. It does so by providing a colorful TUI that shows CPU register values, disassembled code, values on the stack, the backtrace and the list of current threads. The colors indicate where pointers point, and the pointers are dereferenced to show what they contain. All this displayed context immediately helps in understanding what is going on in the debugged program. Pwndbg provides lots of useful commands, e.g., for dumping process information, inspecting glibc or Linux kernel heap allocator metadata, finding pointers in memory, displaying stack canary/cookie values, getting a hexdump of memory, and many more. Apart from this, Pwndbg provides an API for using or extending its features when users need to script tasks in GDB.

Contributors can propose working on more than one idea and divide their time between them accordingly. Some of the projects could also be extended to the large project length (e.g. by supporting more kernel versions with libslub, or implementing more features for kernel debugging).

Python


Python provides the core of the Python Programming Language. There's a single possible project with them this year:

  • Title: Adopting Hardened Compiler Options for C/C++ in CPython
  • Difficulty: Medium
  • Length: 175hr
  • Skills required: C/C++ development experience
  • Description: This project would reduce the potential for future memory safety vulnerabilities in Python by adopting hardened compiler options in the CPython codebase.

    Task outline:
    • There's already a list of compiler option candidates to adopt; use that as the initial list.
    • Evaluate how each compiler option affects performance (using CPython's existing performance suite). Report back on the performance impact of enabling each option.
    • Implement a small custom tool (proposed in the existing issue) that allows ignoring existing violations of compiler options while preventing future violations. At this point we've achieved a lot of value: all future CPython contributions will have these compiler options applied.
    • After the tooling is integrated, fill the rest of the project time by remediating known issues.
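
As a rough illustration of the "ignore existing violations, prevent future ones" step in the task outline, here is a minimal sketch of a warning baseline check. It is not the tool proposed in the existing issue. The function names, the assumed GCC/Clang-style warning line format and the sample file names are all hypothetical.

```python
import re
from collections import Counter

# Assumed warning format: "path:line:col: warning: message [-Wflag]"
WARNING_RE = re.compile(
    r"^(?P<file>[^:\s]+):\d+:\d+: warning: .*\[(?P<flag>-W[^\]]+)\]$"
)


def count_warnings(compiler_output: str) -> Counter:
    """Count warnings per (file, flag) pair in compiler output."""
    counts = Counter()
    for line in compiler_output.splitlines():
        match = WARNING_RE.match(line.strip())
        if match:
            counts[(match["file"], match["flag"])] += 1
    return counts


def new_violations(baseline: Counter, current: Counter) -> dict:
    """Return the (file, flag) pairs whose count grew beyond the baseline."""
    return {
        key: count - baseline.get(key, 0)
        for key, count in current.items()
        if count > baseline.get(key, 0)
    }


# Hypothetical sample output: one known warning, then two newly added ones.
old_build = (
    "Objects/a.c:10:5: warning: comparison of integers of different signs "
    "[-Wsign-compare]\n"
)
new_build = old_build + (
    "Objects/a.c:20:9: warning: comparison of integers of different signs "
    "[-Wsign-compare]\n"
    "Modules/b.c:3:1: warning: unused variable 'x' [-Wunused-variable]\n"
)

baseline = count_warnings(old_build)
regressions = new_violations(baseline, count_warnings(new_build))
print(regressions)
```

A CI job built along these lines could fail whenever `regressions` is non-empty, while warnings already recorded in the baseline are tolerated until they are remediated. Counting per file and flag is only one possible design. It avoids breaking when line numbers shift, but it cannot tell one old warning from a new one that replaced it.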
Possible mentors:
  • Seth Larson (PSF): seth@python.org
  • Dustin Ingram (GOSST): dii@google.com

Contact Links

Chat (Note: this uses the primary Python GSoC chat, so we may redirect you. E-mailing the two addresses above is preferred.)


Friends of the PSF

Here are some more interesting organizations that use Python!

  • TARDIS: TARDIS is an open-source Monte Carlo radiative-transfer spectral synthesis code for 1D models of supernova ejecta. It is designed for rapid spectral modelling of supernovae. It is developed and maintained by a multi-disciplinary team including software engineers, computer scientists, statisticians, and astrophysicists.