CatMatcher Quickstart
1. Module import
[5]:
import os
import pandas as pd
from stilts_wrapper.matcher import StiltsMatcher
2. Setup
To use CatMatcher, provide the path to the directory that holds the files you want to cross-match. Because the STILTS backend locates its input and output files through relative paths, users are also asked to provide a directory name. CatMatcher then creates a directory with that name inside the file directory, together with the subdirectories needed to run STILTS in the background.
[6]:
file_path = '../../Data/example_files/'
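# Keep only files with a .csv extension (a plain "csv" substring test would also match names such as "csv_notes.txt")
files = [f for f in os.listdir(file_path) if f.endswith(".csv")]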
files
[6]:
['Megeath_YSOs.csv', 'Nemesis_YSOs_OrionB.csv', 'Disks_NGC2024.csv']
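The directory layout described in this section can be sketched with `pathlib`. The names `CatMatcher_cwd`, `scripts` and `matches` are taken from the paths printed in the output of section 4; the `resolve` helper is purely illustrative and not part of CatMatcher. It shows why the STILTS command refers to the input catalogues as `../../<file>` and to the result as `../matches/<file>` when it runs from the `scripts` directory.

```python
from pathlib import PurePosixPath

# Layout inferred from the paths printed in this notebook (an assumption,
# not a documented guarantee of CatMatcher).
file_dir = PurePosixPath("Data/example_files")   # directory holding the catalogues
work_dir = file_dir / "CatMatcher_cwd"            # working directory created inside it
scripts_dir = work_dir / "scripts"                # STILTS is run from here
matches_dir = work_dir / "matches"                # match results are written here


def resolve(base, relative):
    """Resolve a relative path against base, collapsing '..' components."""
    parts = list(base.parts)
    for part in PurePosixPath(relative).parts:
        if part == "..":
            parts.pop()
        else:
            parts.append(part)
    return PurePosixPath(*parts)


# Relative paths as they appear in the generated STILTS command
input_path = resolve(scripts_dir, "../../Megeath_YSOs.csv")
output_path = resolve(scripts_dir, "../matches/test.csv")
print(input_path)
print(output_path)
```

The first path points back into the catalogue directory and the second into the `matches` subdirectory of the working directory.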
3. Define match parameters
[7]:
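# Coordinate columns (right ascension, declination) to match on: one entry per file, in the same order as `files`
match_values = ["RAJ2000 DEJ2000", "RA DE", "RAJ2000 DEJ2000"]
match_radius = 1  # maximum separation between matched sources, in arcseconds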
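To make the meaning of the match radius concrete, the following minimal sketch computes the angular separation between two sky positions with the haversine formula and compares it to the radius. The function name and the two example positions are made up for illustration; STILTS performs its own sky matching internally, so this only illustrates the criterion and is not the backend's code.

```python
import math


def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcseconds between two positions given in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula, numerically stable for small separations
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(h))) * 3600


match_radius = 1  # arcseconds

# Two made-up positions, offset by 0.5 and 2 arcsec in declination
close_sep = angular_separation_arcsec(85.4, -1.9, 85.4, -1.9 + 0.5 / 3600)
far_sep = angular_separation_arcsec(85.4, -1.9, 85.4, -1.9 + 2.0 / 3600)

print(close_sep <= match_radius)  # inside the match radius
print(far_sep <= match_radius)    # outside the match radius
```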
4. Performing a simple match
All the matching functionality is encapsulated in the StiltsMatcher class.
[8]:
# Initialize the matcher
Nmatcher = StiltsMatcher(
    file_list=files,
    file_path=file_path,
    output_file_name="test.csv",  # name of the output file
    ifmt="csv",
    match_radius=match_radius,
    match_values=match_values,
    join_mode="match",
)
csv ['test', 'csv']
[9]:
Nmatcher.build_N_match(print_command=True)
Command written to /Users/alena/PycharmProjects/CatMatcher/Data/example_files/CatMatcher_cwd/scripts/Nmatch_commands
stilts tmatchn multimode=group nin=3 matcher=sky params=1 \
in1=../../Megeath_YSOs.csv ifmt1=csv suffix1='_1' values1='RAJ2000 DEJ2000' \
in2=../../Nemesis_YSOs_OrionB.csv ifmt2=csv suffix2='_2' values2='RA DE' \
in3=../../Disks_NGC2024.csv ifmt3=csv suffix3='_3' values3='RAJ2000 DEJ2000' \
join1=match join2=match join3=match \
fixcols=dups out=../matches/test.csv ofmt=csv progress=time
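The printed command shows how the matcher parameters map onto `stilts tmatchn` arguments: each input file gets an `inN`, `ifmtN`, `suffixN` and `valuesN` entry, and the join mode is repeated for every table. The sketch below reproduces that mapping with a hypothetical helper, `build_tmatchn_command`. It is an illustration written for this quickstart, not CatMatcher's actual implementation, and the `../../` and `../matches/` prefixes are simply copied from the output above.

```python
def build_tmatchn_command(files, values, radius_arcsec, out_name, ifmt="csv", join="match"):
    """Assemble a `stilts tmatchn` command string (hypothetical helper, for illustration only)."""
    if len(files) != len(values):
        raise ValueError("need one column specification per input file")

    lines = [f"stilts tmatchn multimode=group nin={len(files)} matcher=sky params={radius_arcsec}"]
    # One block of per-table arguments for each input catalogue (tables are numbered from 1)
    for i, (name, cols) in enumerate(zip(files, values), start=1):
        lines.append(f"in{i}=../../{name} ifmt{i}={ifmt} suffix{i}='_{i}' values{i}='{cols}'")
    # The same join mode is applied to every table
    lines.append(" ".join(f"join{i}={join}" for i in range(1, len(files) + 1)))
    # The output format is taken from the output file extension
    ofmt = out_name.rsplit(".", 1)[-1]
    lines.append(f"fixcols=dups out=../matches/{out_name} ofmt={ofmt} progress=time")
    return " \\\n".join(lines)


cmd = build_tmatchn_command(
    files=["Megeath_YSOs.csv", "Nemesis_YSOs_OrionB.csv", "Disks_NGC2024.csv"],
    values=["RAJ2000 DEJ2000", "RA DE", "RAJ2000 DEJ2000"],
    radius_arcsec=1,
    out_name="test.csv",
)
print(cmd)
```

Because the column specifications are paired with the files by position, the order of `match_values` has to follow the order of the file list.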
[10]:
Nmatcher.perform_Nmatch()
Command written to /Users/alena/PycharmProjects/CatMatcher/Data/example_files/CatMatcher_cwd/scripts/Nmatch_commands
Output: Current directory: /Users/alena/PycharmProjects/CatMatcher/Data/example_files/CatMatcher_cwd/scripts
Error: Params: Max Error(Number)/arcsec=1.0
Tuning: HEALPix k(Integer)=14
Processing: Split, BasicParallel
Binning rows for table 1......................................................
Time: 0.0s
Binning rows for table 2......................................................
Time: 0.0s
Binning rows for table 3......................................................
Time: 0.0s
Average bin count per row: 1.0820783
4894 row refs in 4311 bins
(average bin occupancy 1.1352354)
Consolidating potential match groups..........................................
Time: 0.0s
Locating pairs................................................................
Time: 0.0s
Eliminating internal links....................................................
Time: 0.0s
Internal links removed: 1
Mapping rows to links.........................................................
Time: 0.0s
Identifying isolated links....................................................
Time: 0.0s
Walking links..............................................................
Time: 0.0s
Eliminating internal links....................................................
Time: 0.0s
Elapsed time for match: 0 seconds
Populate index maps...........................................................
Time: 0.0s
Return code: 0
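The `Error:` label in the log above does not mean the match failed: STILTS writes its progress messages to the standard error stream, and a return code of 0 indicates success. The sketch below shows one way a wrapper might run an external command and report both streams and the return code. `run_command` is a hypothetical helper, and a small child Python process stands in for the STILTS call so that the example runs without STILTS installed.

```python
import subprocess
import sys


def run_command(args, cwd=None):
    """Run a command and return (return code, stdout, stderr) as text."""
    result = subprocess.run(args, cwd=cwd, capture_output=True, text=True)
    return result.returncode, result.stdout, result.stderr


# Stand-in for the STILTS call: a child process that writes to both streams
code, out, err = run_command(
    [sys.executable, "-c", "import sys; print('result'); print('progress', file=sys.stderr)"]
)

print("Output:", out.strip())
print("Error:", err.strip())   # progress-style messages arrive here even on success
print("Return code:", code)    # 0 means the command completed successfully
```

Checking the return code, rather than whether anything was written to standard error, is the more reliable way to decide if the match succeeded.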