MISTI: Metadata-Informed Scientific Text and Image Representation through Contrastive Learning
Published in Proceedings of the Fourth Workshop on Scholarly Document Processing (SDP 2024), 2024
This paper uses contrastive learning to create better joint image-text embeddings of scientific figures and their captions, leading to higher performance on an information retrieval task. The captions were augmented with metadata found only in scientific manuscripts, which improved the grouping of figure-caption pairs from the same field, section, and topic.
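The general idea can be illustrated with a minimal sketch, assuming a CLIP-style symmetric contrastive objective over a batch of matched figure and caption embeddings. This is not the paper's implementation: the metadata template in `augment_caption`, the function names, and the `temperature` value are all hypothetical choices made for illustration, and the embeddings here are plain NumPy arrays rather than encoder outputs.

```python
import numpy as np


def augment_caption(caption, field, section, topic):
    # Hypothetical template: the paper's actual metadata format may differ.
    return f"field: {field}; section: {section}; topic: {topic}; caption: {caption}"


def _log_softmax(logits):
    # Numerically stable row-wise log-softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))


def symmetric_contrastive_loss(image_emb, text_emb, temperature=0.07):
    # Assumes row i of image_emb and row i of text_emb form a matched pair.
    image_emb = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    text_emb = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)

    # Cosine similarities scaled by temperature; the diagonal holds true pairs.
    logits = image_emb @ text_emb.T / temperature
    idx = np.arange(logits.shape[0])

    # Cross-entropy in both directions: image-to-text and text-to-image.
    loss_i2t = -_log_softmax(logits)[idx, idx].mean()
    loss_t2i = -_log_softmax(logits.T)[idx, idx].mean()
    return 0.5 * (loss_i2t + loss_t2i)


if __name__ == "__main__":
    print(augment_caption("Accuracy by model size.", "computer science", "results", "retrieval"))
    rng = np.random.default_rng(0)
    emb = rng.normal(size=(4, 8))
    print(symmetric_contrastive_loss(emb, emb))        # matched pairs
    print(symmetric_contrastive_loss(emb, emb[::-1]))  # mismatched pairs
```

In this sketch the metadata enters only through the caption text, so any text encoder that produces `text_emb` would see the field, section, and topic alongside the caption. Matched pairs yield a lower loss than mismatched ones, which is the signal that pulls related figures and captions together in the embedding space.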
Recommended citation: Taechoyotin, P., & Acuna, D. (2024, August). MISTI: Metadata-Informed Scientific Text and Image Representation through Contrastive Learning. In Proceedings of the Fourth Workshop on Scholarly Document Processing (SDP 2024) (pp. 155-164).
Download Paper
