{ "cells": [ { "cell_type": "markdown", "source": [ "# `CatMatcher` Quickstart\n", "\n", "### 1. Module import" ], "metadata": { "collapsed": false } }, { "cell_type": "code", "execution_count": 5, "outputs": [], "source": [ "import os\n", "import pandas as pd\n", "from stilts_wrapper.matcher import StiltsMatcher" ], "metadata": { "collapsed": false, "ExecuteTime": { "end_time": "2025-05-15T15:41:41.783435Z", "start_time": "2025-05-15T15:41:41.774523Z" } } }, { "cell_type": "markdown", "source": [ "### 2. Setup\n", "\n", "To use `CatMatcher`, users need to provide a path to the directory holding the files they would like to perform a cross-match on.\n", "Since the `STILTS` backend works with relative path variables for the matching process, the users are also prompted to provide a directory name. `CatMatcher` will then create a directory with that name inside the file directory, along with the file structure needed to run `STILTS` in the background." ], "metadata": { "collapsed": false } }, { "cell_type": "code", "execution_count": 6, "outputs": [ { "data": { "text/plain": "['Megeath_YSOs.csv', 'Nemesis_YSOs_OrionB.csv', 'Disks_NGC2024.csv']" }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "file_path = '../../Data/example_files/'\n", "files = [f for f in os.listdir(file_path) if \"csv\" in f]\n", "\n", "files" ], "metadata": { "collapsed": false, "ExecuteTime": { "end_time": "2025-05-15T15:41:41.783919Z", "start_time": "2025-05-15T15:41:41.777560Z" } } }, { "cell_type": "markdown", "source": [ "### 3. Define Match parameters" ], "metadata": { "collapsed": false } }, { "cell_type": "code", "execution_count": 7, "outputs": [], "source": [ "match_values = [\"RAJ2000 DEJ2000\", \"RA DE\", \"RAJ2000 DEJ2000\"] # Names of the columns to match\n", "match_radius = 1 # arcseconds" ], "metadata": { "collapsed": false, "ExecuteTime": { "end_time": "2025-05-15T15:41:41.784646Z", "start_time": "2025-05-15T15:41:41.782096Z" } } }, { "cell_type": "markdown", "source": [ "### 4. Performing a simple match\n", "\n", "All the matching functionality is encapsulated in the `StiltsMatcher` class." ], "metadata": { "collapsed": false } }, { "cell_type": "code", "execution_count": 8, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "csv ['test', 'csv']\n" ] } ], "source": [ "# Initialize Matcher\n", "Nmatcher = StiltsMatcher(\n", " file_list=files,\n", " file_path=file_path,\n", " output_file_name=\"test.csv\", # name of the output file\n", " ifmt=\"csv\",\n", " match_radius=match_radius,\n", " match_values=match_values,\n", " join_mode=\"match\",\n", ")" ], "metadata": { "collapsed": false, "ExecuteTime": { "end_time": "2025-05-15T15:41:41.791099Z", "start_time": "2025-05-15T15:41:41.785482Z" } } }, { "cell_type": "code", "execution_count": 9, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Command written to /Users/alena/PycharmProjects/CatMatcher/Data/example_files/CatMatcher_cwd/scripts/Nmatch_commands\n", "stilts tmatchn multimode=group nin=3 matcher=sky params=1 \\\n", "\tin1=../../Megeath_YSOs.csv ifmt1=csv suffix1='_1' values1='RAJ2000 DEJ2000' \\\n", "\tin2=../../Nemesis_YSOs_OrionB.csv ifmt2=csv suffix2='_2' values2='RA DE' \\\n", "\tin3=../../Disks_NGC2024.csv ifmt3=csv suffix3='_3' values3='RAJ2000 DEJ2000' \\\n", "\tjoin1=match \tjoin2=match join3=match \\\n", "\tfixcols=dups out=../matches/test.csv ofmt=csv progress=time\n" ] } ], "source": [ "Nmatcher.build_N_match(print_command=True)" ], "metadata": { "collapsed": false, "ExecuteTime": { "end_time": "2025-05-15T15:41:41.822473Z", "start_time": "2025-05-15T15:41:41.788785Z" } } }, { "cell_type": "code", "execution_count": 10, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Command written to /Users/alena/PycharmProjects/CatMatcher/Data/example_files/CatMatcher_cwd/scripts/Nmatch_commands\n", "Output: Current directory: /Users/alena/PycharmProjects/CatMatcher/Data/example_files/CatMatcher_cwd/scripts\n", "\n", "Error: Params: Max Error(Number)/arcsec=1.0\n", "Tuning: HEALPix k(Integer)=14\n", "Processing: Split, BasicParallel\n", "Binning rows for table 1......................................................\n", "Time: 0.0s\n", "Binning rows for table 2......................................................\n", "Time: 0.0s\n", "Binning rows for table 3......................................................\n", "Time: 0.0s\n", "Average bin count per row: 1.0820783\n", "4894 row refs in 4311 bins\n", "(average bin occupancy 1.1352354)\n", "Consolidating potential match groups..........................................\n", "Time: 0.0s\n", "Locating pairs................................................................\n", "Time: 0.0s\n", "Eliminating internal links....................................................\n", "Time: 0.0s\n", "Internal links removed: 1\n", "Mapping rows to links.........................................................\n", "Time: 0.0s\n", "Identifying isolated links....................................................\n", "Time: 0.0s\n", "Walking links..............................................................\n", "Time: 0.0s\n", "Eliminating internal links....................................................\n", "Time: 0.0s\n", "Elapsed time for match: 0 seconds\n", "Populate index maps...........................................................\n", "Time: 0.0s\n", "Params: Max Error(Number)/arcsec=1.0\n", "Tuning: HEALPix k(Integer)=14\n", "Processing: Split, BasicParallel\n", "Binning rows for table 1......................................................\n", "Time: 0.0s\n", "Binning rows for table 2......................................................\n", "Time: 0.0s\n", "Binning rows for table 3......................................................\n", "Time: 0.0s\n", "Average bin count per row: 1.0820783\n", "4894 row refs in 4311 bins\n", "(average bin occupancy 1.1352354)\n", "Consolidating potential match groups..........................................\n", "Time: 0.0s\n", "Locating pairs................................................................\n", "Time: 0.0s\n", "Eliminating internal links....................................................\n", "Time: 0.0s\n", "Internal links removed: 1\n", "Mapping rows to links.........................................................\n", "Time: 0.0s\n", "Identifying isolated links....................................................\n", "Time: 0.0s\n", "Walking links..............................................................\n", "Time: 0.0s\n", "Eliminating internal links....................................................\n", "Time: 0.0s\n", "Elapsed time for match: 0 seconds\n", "Populate index maps...........................................................\n", "Time: 0.0s\n", "\n", "Return code: 0\n" ] } ], "source": [ "Nmatcher.perform_Nmatch()" ], "metadata": { "collapsed": false, "ExecuteTime": { "end_time": "2025-05-15T15:41:43.232Z", "start_time": "2025-05-15T15:41:41.792491Z" } } } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.6" } }, "nbformat": 4, "nbformat_minor": 0 }