[Deep Learning Training] Transformer Fundamentals and Applications -- Part 3: Applying Transformers to Images

Video type: Standard
Published: November 28, 2024, 17:00
Video length: 30:20
Views: 436
Likes: 17
Comments: -
Engagement rate: 3.9%
Data checked: December 5, 2024, 13:31

Video description

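This video is Part 3 of the "[Deep Learning Training] Transformer Fundamentals and Applications" series. It explains how Transformers are applied to images, and which tasks have become possible by combining them with natural language.
The slides are published on SlideShare: https://www.slideshare.net/slideshow/...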

[References]
・Deep Residual Learning for Image Recognition
https://arxiv.org/abs/1512.03385
・An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
https://arxiv.org/abs/2010.11929
・On the Relationship between Self-Attention and Convolutional Layers
https://arxiv.org/abs/1911.03584
・Image Style Transfer Using Convolutional Neural Networks
https://ieeexplore.ieee.org/document/...
・Are Convolutional Neural Networks or Transformers more like human vision?
https://arxiv.org/abs/2105.07197
・How Do Vision Transformers Work?
https://arxiv.org/abs/2202.06709
・Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization
https://arxiv.org/abs/1610.02391
・Quantifying Attention Flow in Transformers
https://arxiv.org/abs/2005.00928
・Transformer Interpretability Beyond Attention Visualization
https://arxiv.org/abs/2012.09838
・End-to-End Object Detection with Transformers
https://arxiv.org/abs/2005.12872
・SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers
https://arxiv.org/abs/2105.15203
・Training data-efficient image transformers & distillation through attention
https://arxiv.org/abs/2012.12877
・Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
https://arxiv.org/abs/2103.14030
・Masked Autoencoders Are Scalable Vision Learners
https://arxiv.org/abs/2111.06377
・Emerging Properties in Self-Supervised Vision Transformers
https://arxiv.org/abs/2104.14294
・Scaling Laws for Neural Language Models
https://arxiv.org/abs/2001.08361
・Learning Transferable Visual Models From Natural Language Supervision
https://arxiv.org/abs/2103.00020
・Scaling Rectified Flow Transformers for High-Resolution Image Synthesis
https://arxiv.org/abs/2403.03206
・Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models
https://arxiv.org/abs/2402.17177
・SSII2024 Technology Map
https://confit.atlas.jp/guide/event/s...

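We have launched a video channel (/ nnabla) covering Neural Network Libraries ( https://nnabla.org/, https://github.com/sony/nnabla/ ), the open-source deep learning framework provided by Sony. In addition to tutorials and tips for Neural Network Libraries, we will share cutting-edge deep learning technical content (lectures and introductions to state-of-the-art papers). Please subscribe and support the channel!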

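Neural Network Console ( https://dl.sony.com/ ), the intuitive GUI-based deep learning development environment also provided by Sony, runs a popular YouTube channel (/ @neuralnetworkconsole) that publishes many deep learning technical courses and tool tutorials. Please subscribe to and support that channel as well.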

nnabla Deep Learning Channel
