Object localization is the task of finding the bounding boxes of objects in a scene. Most state-of-the-art approaches rely on meticulously handcrafted training datasets with explicit bounding box annotations. In this work, we propose a generative adversarial reinforcement learning framework that works without any explicit bounding box information. Instead of relying on bounding boxes, our framework uses tightly cropped object images as training data. Our object localization framework consists of two parts: a reinforcement learning agent (RL agent) and a discriminator. The RL agent takes input scenes and crops them with the objective of producing a tightly cropped object image. The discriminator tries to distinguish whether a given image was generated by the RL agent or comes from a database of tightly cropped object images. Experiments indicate that promising localization performance can be achieved without explicit bounding box data. These results suggest that generative adversarial reinforcement learning could be a useful tool for other learning problems where explicit input/output paired data is not available.
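
The interaction between the cropping agent and the discriminator can be illustrated with a minimal toy sketch. Everything in it is an assumption made for illustration and is not the framework's actual implementation: the names `crop`, `apply_action`, `toy_discriminator`, and `greedy_localize` are hypothetical, the learned RL policy is replaced by a simple greedy search, and the adversarially trained discriminator is replaced by a fixed scoring function that treats a crop as "tight" when it contains mostly object pixels.

```python
from typing import Callable, List, Tuple

# Hypothetical types for this sketch (not from the paper).
Image = List[List[float]]          # 2D grid; 1.0 = object pixel, 0.0 = background
Box = Tuple[int, int, int, int]    # (top, left, bottom, right), bottom/right exclusive

ACTIONS = ("top", "left", "bottom", "right")  # each action shrinks one side by a pixel


def crop(image: Image, box: Box) -> Image:
    """Return the sub-image covered by the box."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


def apply_action(box: Box, action: str) -> Box:
    """Shrink the box by one pixel on the given side, never below 1x1."""
    top, left, bottom, right = box
    if action == "top" and bottom - top > 1:
        top += 1
    elif action == "bottom" and bottom - top > 1:
        bottom -= 1
    elif action == "left" and right - left > 1:
        left += 1
    elif action == "right" and right - left > 1:
        right -= 1
    return (top, left, bottom, right)


def toy_discriminator(patch: Image) -> float:
    """Stand-in for a trained discriminator (assumption): scores a crop by the
    fraction of object pixels, so a tight crop scores close to 1.0."""
    total = sum(sum(row) for row in patch)
    area = len(patch) * len(patch[0])
    return total / area


def greedy_localize(image: Image,
                    score: Callable[[Image], float],
                    max_steps: int = 50) -> Box:
    """Stand-in for the RL agent (assumption): start from the full scene and
    greedily take the cropping action that most increases the score."""
    box = (0, 0, len(image), len(image[0]))
    for _ in range(max_steps):
        current = score(crop(image, box))
        candidates = [apply_action(box, a) for a in ACTIONS]
        best = max(candidates, key=lambda b: score(crop(image, b)))
        if score(crop(image, best)) <= current:
            break  # no action improves the score; stop cropping
        box = best
    return box


# A 6x6 toy scene with a 2x3 object occupying rows 2-3 and columns 1-3.
scene = [[0.0] * 6 for _ in range(6)]
for r in range(2, 4):
    for c in range(1, 4):
        scene[r][c] = 1.0

print(greedy_localize(scene, toy_discriminator))  # (2, 1, 4, 4)
```

In this sketch the discriminator score plays the role of the reward signal: the agent never sees a bounding box, only how "tightly cropped" its output looks. In the framework described above, both components would be learned, with the discriminator trained on real tightly cropped object images and the agent trained to fool it.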