Started adding related works
smsharma committed Mar 11, 2024
1 parent 61f1dc1 commit 42f3ecc
Showing 5 changed files with 15 additions and 7 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -7,5 +7,5 @@

## Paper draft

[Link to paper draft](https://github.com/smsharma/HubbleCLIP/blob/main-pdf/paper/main.pdf). The PDF is compiled automatically from the `main` branch into the `main-pdf` branch on push.
[Link to paper draft](https://github.com/smsharma/HubbleCLIP/blob/main-pdf/paper/hubble_paperclip.pdf). The PDF is compiled automatically from the `main` branch into the `main-pdf` branch on push.

Binary file modified paper/hubble_paperclip.pdf
Binary file not shown.
20 changes: 14 additions & 6 deletions paper/hubble_paperclip.tex
@@ -189,9 +189,17 @@ \section{Introduction}
Our method opens up the possibility of interacting with astronomical survey data using free-form natural language as an interface, which is a cornerstone of the success of the modern foundation model paradigm.
%

The CLIP family of foundation models, which in their original form embed images and associated captions into a common representation space via contrastive learning, have shown strong performance and generalization capabilities on a variety of downstream tasks including zero-shot classification and image retrieval.
\paragraph*{Related work}

% The CLIP family of foundation models, which in their original form embed images and associated captions into a common representation space via contrastive learning, have shown strong performance and generalization capabilities on a variety of downstream tasks including zero-shot classification and image retrieval~\citep{radford2021learning}.
%
The concept of learning task-agnostic representations via self-supervised learning has been applied within astrophysics \citep{slijepcevic2024radio,stein2021self,hayat2021self,slijepcevic2022learning} and used for downstream tasks like object similarity search \citep{stein2021self}, gravitational lens finding \citep{stein2022mining}, estimation of Galactic distances \citep{hayat2021estimating}, and identification of rare galaxies \citep{walmsley2023rare}.
%
Associating different modalities via contrastive training has been employed in many other scientific domains~\citep[e.g.,][]{liu2023text,Sanchez-Fernandez2022.11.17.516915,lanusse2023astroclip,cepeda2023geoclip}, and has been shown to be effective in learning semantically meaningful joint representations. Here, we present for the first time an application associating diverse astronomical data with the text modality.
%
For a recent review of contrastive learning in astrophysics, see \citet{huertas2023brief}.
% \citet{bowles2023radio,bowles2022new}

The rest of this paper is organized as follows.
%
@@ -206,7 +214,7 @@ \section{Introduction}
\begin{figure*}[!t]
\centering
\includegraphics[width=0.99\textwidth]{plots/figure.pdf}
\caption{\changes{Overview of the PAPERCLIP method. (Left) A pre-trained CLIP model is fine-tuned using a dataset of \hubble observations and corresponding proposal abstracts. (Right) The fine-tuned model can then be used for downstream tasks such as observation retrieval (i.e., finding the observations most relevant to a given text query). The proposal abstract snippet here corresponds to proposal ID \href{https://archive.stsci.edu/proposal_search.php?id=16914&mission=hst}{16914}}}
\caption{\changes{Overview of the PAPERCLIP method. (Left) A pre-trained CLIP model is fine-tuned using a dataset of \hubble observations and corresponding proposal abstracts. (Right) The fine-tuned model can then be used for downstream tasks such as observation retrieval (i.e., finding the observations most relevant to a given text query). The proposal abstract snippet here corresponds to proposal ID \href{https://archive.stsci.edu/proposal_search.php?id=16914&mission=hst}{16914}}.}
\label{fig:overview}
\end{figure*}

@@ -349,9 +357,9 @@ \section{Methodology}
%
With PAPERCLIP, we leverage the strong generalization capabilities demonstrated by pre-trained CLIP models and adapt them to domain-specific \hubble data via fine-tuning.

\subsection{Contrastive Language-Image Pre-training (CLIP)}
\subsection{Contrastive Language-Image Pre-training}

CLIP \citep[Contrastive Language-Image Pre-training;][]{radford2021learning} is a multi-modal neural network model pre-trained on a large corpus of image-text pairs via weak supervision using a contrastive loss.
Contrastive Language-Image Pre-training \citep[CLIP;][]{radford2021learning} is a multi-modal neural network model pre-trained on a large corpus of image-text pairs via weak supervision using a contrastive loss.
%
Given a minibatch $\mathcal{B}$ of $|\mathcal{B}|$ image-text pairs $\{(I_i, T_i)\}$, the goal is to align the learned representations of corresponding (positive) pairs $(I_i, T_i)$ while repelling the representations of unaligned (negative) pairs $(I_i, T_{j\neq i})$.
%
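To make the contrastive objective concrete, the snippet below is a minimal NumPy sketch of a symmetric CLIP-style loss over a minibatch of embedding pairs; the function name, the temperature value, and the use of NumPy are illustrative assumptions and not the paper's actual implementation.

```python
import numpy as np

def clip_contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric contrastive (InfoNCE-style) loss over a minibatch of
    image-text embedding pairs; row i of each array corresponds to pair i."""
    # L2-normalise so the dot product is a cosine similarity.
    image_emb = image_emb / np.linalg.norm(image_emb, axis=-1, keepdims=True)
    text_emb = text_emb / np.linalg.norm(text_emb, axis=-1, keepdims=True)

    # (B, B) similarity matrix; diagonal entries are the positive pairs.
    logits = image_emb @ text_emb.T / temperature
    labels = np.arange(logits.shape[0])

    def cross_entropy(l, y):
        l = l - l.max(axis=-1, keepdims=True)  # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=-1, keepdims=True))
        return -log_probs[np.arange(len(y)), y].mean()

    # Average the image-to-text and text-to-image directions.
    return 0.5 * (cross_entropy(logits, labels) + cross_entropy(logits.T, labels))
```

Minimising this loss pulls the representations of matched image-text pairs together while pushing mismatched pairs apart in the shared embedding space.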
@@ -422,7 +430,7 @@ \subsection{Evaluation Metrics}

We also qualitatively evaluate the learned embeddings through image retrieval (i.e., retrieving the most relevant images from the validation set using natural language queries) and description retrieval (i.e., querying the astrophysical object classes and science use cases most relevant to a given observation, akin to zero-shot classification) experiments.
%
For the description/text retrieval evaluation, we define a list of possible text associations (i.e., classes), which we show in App.~\ref{app:categories}, by querying the \textsc{Claude 2}\footnote{\url{https://claude.ai/}} large language model along with manual curation.
For the description/text retrieval evaluation, we define a list of possible text associations (i.e., classes), which we show in App.~\ref{app:categories}, by querying the \textsc{Claude 2}\footnote{\url{https://claude.ai/}} large language model, followed by manual curation.
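Both retrieval directions reduce to a nearest-neighbour lookup by cosine similarity in the shared embedding space; the sketch below illustrates the idea, with hypothetical names and a top-k interface that are not taken from the paper's code.

```python
import numpy as np

def retrieve_top_k(query_emb, candidate_embs, k=5):
    """Return indices and cosine similarities of the k candidates (images or
    text descriptions) closest to the query in the joint embedding space."""
    q = query_emb / np.linalg.norm(query_emb)
    c = candidate_embs / np.linalg.norm(candidate_embs, axis=-1, keepdims=True)
    scores = c @ q                    # cosine similarity of each candidate with the query
    order = np.argsort(-scores)[:k]   # highest-similarity candidates first
    return order, scores[order]
```

The same function covers image retrieval (a text query embedding scored against image embeddings) and description retrieval (an observation embedding scored against class-description embeddings).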

\section{Results and Discussion}
\label{sec:results}
Binary file modified paper/plots/figure.pdf
Binary file not shown.
Binary file modified paper/plots/pro.afdesign
Binary file not shown.
